Closed Bug 1527228 Opened 10 months ago Closed 10 months ago

Remote profiling crashes GeckoView example app

Categories

(Core :: Gecko Profiler, defect, P1)

ARM
All
defect

RESOLVED FIXED
mozilla67
Tracking Status
firefox-esr60 --- unaffected
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- fixed

People

(Reporter: vchin, Assigned: mstange)

Details

(Keywords: regression)

Attachments

(2 files)

Remote profiling crashes the latest GeckoView example app. Every time I click Stop to collect the profile, the app crashes and WebIDE doesn’t give me a profile.

STR:

  1. Launch GeckoView Example App
  2. Launch WebIDE
  3. Start profiler (I used 10ms sampling with 180MB buffer size, no screenshots)
  4. Navigate to a page (I used cnn.com)
  5. Stop profiler

App crashes and WebIDE panel freezes.

Thank you for reporting, Vicky!

Could you capture a logcat of the crash to see whether it provides any details?

Flags: needinfo?(vchin)

Could not reproduce with Reference Browser 1.0.1907 on a OnePlus, debugging with 67.0a1 (2019-02-12) (64-bit); with and without Screenshots.

This also happens for me in today's reference browser build: https://crash-stats.mozilla.com/report/index/21eb77f8-fb1a-4e53-9c01-137420190212

Attached file profile_crash.out
Flags: needinfo?(vchin)

:mstange recommended profiling with Responsiveness turned off; with that setting it does NOT crash.

The crash report from comment 3 points at a line of code that stores a sample's responsiveness value in a Maybe<double> on the stack. I don't know why that line would crash! But it clearly does crash.

I can also confirm that disabling the Responsiveness profiler feature allows me to get profiles without crashing on the reference browser.

Component: Performance Tools (Profiler/Timeline) → Gecko Profiler
Priority: -- → P1
Product: DevTools → Core

This should be easy to debug for anyone who knows how to make a local Android build and how to attach a debugger to a child process on the device. (I don't know how to do the latter.)

Crash Signature: [@ mozilla::Maybe<T>::emplace<T> ]

Here's how far I've gotten:
I'm looking at the crash report https://crash-stats.mozilla.com/report/index/21eb77f8-fb1a-4e53-9c01-137420190212 .
I want to obtain the arm disassembly for the two functions void mozilla::Maybe<double>::emplace<double const&>(double const&) (_ZN7mozilla5MaybeIdE7emplaceIJRKdEEEvDpOT_) and StreamSamplesAndMarkers(char const*, int, ProfileBuffer const&, SpliceableJSONWriter&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, mozilla::TimeStamp const&, double, UniqueStacks&) (_Z23StreamSamplesAndMarkersPKciRK13ProfileBufferR20SpliceableJSONWriterRKN7mozilla9TimeStampES9_S9_dR12UniqueStacks).
The address ranges of these two functions in the libxul.so binary are 0x007fc348 to 0x007fc380 and 0x01b84bdc to 0x01b858e8.

The mozilla-central revision for this build is 3a3e393396f418df1490aa0832d0c54fc353d522 and the breakpad ID for libxul.so is 792F88D450818DE2A773C6E8142DB30A0.

The original libxul.so file for this build is in the target.crashreporter-symbols-full.zip file of the Nightly "Android 4.0 API16+ opt" build at https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=ec61d092ed676663850f86d6f1db0d899f399907&selectedJob=227383417 : https://queue.taskcluster.net/v1/task/RJGad5vgR46jLXQiH5ClXA/runs/0/artifacts/public/build/en-US/target.crashreporter-symbols-full.zip
Unzip the zip file, go to libxul.so/792F88D450818DE2A773C6E8142DB30A0/, uncompress libxul.so.dbg.gz and rename the result to libxul.so.

If I load this file in Hopper, it doesn't detect the .text section, and it displays all bytes in the relevant functions as 0x00.
If I load this file in lldb I don't have any success either:

$ lldb
(lldb) file /Users/mstange/Downloads/libxul.so
Current executable set to '/Users/mstange/Downloads/libxul.so' (arm).
(lldb) disassemble -n _ZN7mozilla5MaybeIdE7emplaceIJRKdEEEvDpOT_
error: error reading data from section .text
error: Unable to find symbol with name '_ZN7mozilla5MaybeIdE7emplaceIJRKdEEEvDpOT_'.
(lldb) disassemble -s 0x007fc348
error: error reading data from section .text
error: Failed to disassemble memory at 0x007fc348.

And objdump doesn't give me any disassembly either:

$ ~/code/obj-llvm/bin/llvm-objdump -disassemble -g -start-address=0x007fc348 -stop-address=0x007fc380 /Users/mstange/Downloads/libxul.so

/Users/mstange/Downloads/libxul.so:	file format ELF32-arm-little

$

The above problem occurred because the libxul.so in target.crashreporter-symbols-full.zip is not the real binary; it contains only the debug information.
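For illustration, one way a debug-only ELF file can produce these symptoms: tools such as `objcopy --only-keep-debug` keep the section headers but mark code sections as `SHT_NOBITS`, so `.text` has an address range but no bytes in the file. Whether this particular .dbg file was produced that way is an assumption. The sketch below checks the type field of a raw little-endian ELF32 section header; `SectionHasFileContents` and `ReadLE32` are hypothetical helper names, and the constants come from the ELF specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Section type values from the ELF specification.
constexpr uint32_t kShtProgbits = 1;  // contents are stored in the file
constexpr uint32_t kShtNobits = 8;    // occupies no space in the file

// Layout of an ELF32 section header: 40 bytes, sh_type at byte offset 4.
constexpr size_t kElf32ShdrSize = 40;
constexpr size_t kShTypeOffset = 4;

// Reads a 32-bit little-endian value from an arbitrary byte address.
inline uint32_t ReadLE32(const unsigned char* p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

// Hypothetical helper: returns false when the section header describes a
// section with no bytes in the file, which a disassembler cannot read.
inline bool SectionHasFileContents(const unsigned char* shdr) {
  assert(shdr != nullptr);
  return ReadLE32(shdr + kShTypeOffset) != kShtNobits;
}
```

If `.text` in a file fails this check, a disassembler has nothing to read from it, and the stripped binary that shipped in the build is needed instead.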

Anyway, jrmuizel helped me find the problem: we're getting a SIGBUS on the dereference from the profiler buffer. And there's this comment above the definition of ProfileBufferEntry:

// NB: Packing this structure has been shown to cause SIGBUS issues on ARM.
#if !defined(GP_ARCH_arm)
#  pragma pack(push, 1)
#endif

class ProfileBufferEntry {

So it looks like we've started packing this structure on arm recently. This would happen if GP_ARCH_arm wasn't defined here even though we were building for arm. And, lo and behold, the GP_ARCH_arm definition is located in PlatformMacros.h, which is not included in this file! So I think this #if check only worked by accident in the past, and bug 1520103 accidentally broke it by moving some #includes around.
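To make the effect of packing concrete, here is a minimal sketch. It does not use the real ProfileBufferEntry layout: `NaturalEntry`, `PackedEntry`, `ReadValue` and `RoundTrip` are hypothetical names, and the one-byte tag plus double payload is an assumed shape chosen for illustration. With `#pragma pack(1)` the double sits at offset 1, so a pointer or reference to it is generally misaligned. On ARM targets where floating-point loads require alignment, dereferencing such a pointer can raise SIGBUS. Copying the bytes with `memcpy` is one alignment-safe alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a buffer entry: a one-byte tag and a double.
struct NaturalEntry {
  uint8_t kind;
  double value;  // the compiler inserts padding so this stays aligned
};

#pragma pack(push, 1)
struct PackedEntry {
  uint8_t kind;
  double value;  // at offset 1, so generally not aligned for a double
};
#pragma pack(pop)

static_assert(offsetof(PackedEntry, value) == 1, "packing removes padding");
static_assert(alignof(PackedEntry) == 1, "packed struct is byte-aligned");

// Copies the double out of a serialized PackedEntry byte by byte, which is
// valid at any address, unlike a direct load through a double pointer.
inline double ReadValue(const unsigned char* bytes) {
  assert(bytes != nullptr);
  double d;
  std::memcpy(&d, bytes + offsetof(PackedEntry, value), sizeof(d));
  return d;
}

// Serializes a packed entry into a byte buffer and reads the value back.
inline double RoundTrip(double v) {
  PackedEntry e{1, v};
  unsigned char buf[sizeof(PackedEntry)];
  std::memcpy(buf, &e, sizeof(e));
  return ReadValue(buf);
}
```

The static assertions show why the `#if !defined(GP_ARCH_arm)` guard matters: if the macro is undefined because its header was never included, the packed layout is selected silently, with no compile error.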

Assignee: nobody → mstange
Blocks: 1520103
Status: NEW → ASSIGNED
Keywords: regression
OS: Unspecified → All
Hardware: Unspecified → ARM
Pushed by mstange@themasta.com:
https://hg.mozilla.org/integration/autoland/rev/2da67ecb372b
Include PlatformMacros.h in order to correctly pick up the GP_ARCH_arm define and turn off packing of the ProfileBufferEntry struct, so that we don't do unaligned accesses on ARM. r=gerald
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67