Closed Bug 1079322 Opened 10 years ago Closed 10 years ago

[LogShake] LogParser.jsm broken on Kitkat

Categories

(Firefox OS Graveyard :: Hardware, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:2.2+, b2g-v2.0 unaffected, b2g-v2.0M unaffected, b2g-v2.1 unaffected, b2g-v2.2 verified)

VERIFIED FIXED
blocking-b2g 2.2+
Tracking Status
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- unaffected
b2g-v2.2 --- verified

People

(Reporter: past, Assigned: gerard-majax)

References

Details

(Keywords: dogfood, regression, Whiteboard: [systemsfe])

Attachments

(2 files, 9 obsolete files)

Flame on v184 with shallow flashed gecko & gaia on top. I have "Shake to save system log" enabled, but when I do that I only see this in logcat:

E/QCALOG  (  306): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
D/wpa_supplicant( 1820): RX ctrl_iface - hexdump(len=11): 53 49 47 4e 41 4c 5f 50 4f 4c 4c
D/wpa_supplicant( 1820): wlan0: Control interface command 'SIGNAL_POLL'
D/wpa_supplicant( 1820): nl80211: survey data missing!
E/QCALOG  (  306): [MessageQ] ProcessNewMessage: [XT-CS] unknown deliver target [OS-Agent]
E/QCALOG  (  306): [MessageQ] ProcessNewMessage: [XTWWAN-PE] unknown deliver target [OS-Agent]

I also got the following once after the above messages, so it's likely unrelated, but here it is:

I/cat     (  238): <7>[ 3102.366278] DOM Worker: unhandled page fault (11) at 0xafc66628, code 0x00f
I/cat     (  238): <1>[ 3102.366301] pgd = ecb34000
E/QCALOG  (  306): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
I/cat     (  238): <1>[ 3102.367971] [afc66628] *pgd=2bd75831, *pte=317c56df, *ppte=317c5e5f
I/cat     (  238): <6>[ 3102.377165] 
I/cat     (  238): <6>[ 3102.377180] Pid: 694, comm:           DOM Worker
I/cat     (  238): <6>[ 3102.377196] CPU: 0    Tainted: G        W  O  (3.4.0-g0502ed9-00028-g508ee2d #1)
I/cat     (  238): <6>[ 3102.377215] PC is at 0xb5e33286
I/cat     (  238): <6>[ 3102.377226] LR is at 0xb5d9283b
I/cat     (  238): <6>[ 3102.377240] pc : [<b5e33286>]    lr : [<b5d9283b>]    psr: 400f0030
I/cat     (  238): <6>[ 3102.377247] sp : ae5ff820  ip : 00000000  fp : b186a358
I/cat     (  238): <6>[ 3102.377260] r10: b64b03c0  r9 : b368fa00  r8 : afc6663e
I/cat     (  238): <6>[ 3102.377274] r7 : 00000016  r6 : afc66628  r5 : 00000000  r4 : b186a358
I/cat     (  238): <6>[ 3102.377288] r3 : 00000000  r2 : ae5ff850  r1 : b186a358  r0 : b368fa00
I/cat     (  238): <6>[ 3102.377304] Flags: nZcv  IRQs on  FIQs on  Mode USER_32  ISA Thumb  Segment user
I/cat     (  238): <6>[ 3102.377320] Control: 10c5387d  Table: 2cb3406a  DAC: 00000015
I/cat     (  238): <6>[ 3102.377365] [<c010bd74>] (unwind_backtrace+0x0/0xf8) from [<c01118b8>] (__do_user_fault+0xbc/0x134)
I/cat     (  238): <6>[ 3102.377393] [<c01118b8>] (__do_user_fault+0xbc/0x134) from [<c088af48>] (do_page_fault+0x290/0x400)
I/cat     (  238): <6>[ 3102.377420] [<c088af48>] (do_page_fault+0x290/0x400) from [<c010029c>] (do_DataAbort+0x34/0x98)
I/cat     (  238): <6>[ 3102.377445] [<c010029c>] (do_DataAbort+0x34/0x98) from [<c08897b4>] (__dabt_usr+0x34/0x40)
I/cat     (  238): <6>[ 3102.377461] Exception stack(0xc379dfb0 to 0xc379dff8)
I/cat     (  238): <6>[ 3102.377478] dfa0:                                     b368fa00 b186a358 ae5ff850 00000000
I/cat     (  238): <6>[ 3102.377499] dfc0: b186a358 00000000 afc66628 00000016 afc6663e b368fa00 b64b03c0 b186a358
I/cat     (  238): <6>[ 3102.377517] dfe0: 00000000 ae5ff820 b5d9283b b5e33286 400f0030 ffffffff

Looking for logs/ after that doesn't come up with anything:

$ adb shell ls /sdcard/
DCIM
LOST.DIR
$ adb shell ls /storage/sdcard0/logs
/storage/sdcard0/logs: No such file or directory
$ adb shell ls /storage/sdcard1/logs
/storage/sdcard1/logs: No such file or directory
I don't have any issue on my Xperia ZR with current Gecko.
I have the same error message and no logs on any sdcard with:
Gaia-Rev        9050edcda308b65d86577c8ed0eedc5c568d8e44
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/0c8ae792f1c0
Build-ID        20141007160202
Version         35.0a1
Device-Name     flame-kk
FW-Release      4.4.2
FW-Incremental  34
FW-Date         Tue Sep 30 14:06:36 CST 2014
Bootloader      L1TC00011840
NO repro on JB though:
Gaia-Rev        0bc74ce502672cf0265b24cf3a25d117c3de5e71
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/e4cfacb76830
Build-ID        20141008040203
Version         35.0a1
Device-Name     flame
FW-Release      4.3
FW-Incremental  110
FW-Date         Fri Jun 27 15:57:58 CST 2014
Bootloader      L1TC00011230

Do you have any idea, James?
Component: Gaia::System → Hardware
Flags: needinfo?(hobinjk)
None of those log messages look like failure modes of the log capturer. My first idea would be that due to either user error or a differently calibrated accelerometer what you think is a shake is not considered a shaking motion by the listener.

The easiest way of testing this is to instrument https://mxr.mozilla.org/mozilla-central/source/b2g/components/LogShake.jsm#163
Flags: needinfo?(hobinjk)
(In reply to James Hobin from comment #4)
> None of those log messages look like failure modes of the log capturer. My
> first idea would be that due to either user error or a differently
> calibrated accelerometer what you think is a shake is not considered a
> shaking motion by the listener.
> 
> The easiest way of testing this is to instrument
> https://mxr.mozilla.org/mozilla-central/source/b2g/components/LogShake.
> jsm#163

I did assist johann in shaking, believe me we did it the proper way :)
Keywords: dogfood, regression
> 10-20 10:11:24.727   211   211 W GeckoConsole: [JavaScript Error: "RangeError: argument 1 accesses an index that is out of range" {file: "resource://gre/modules/LogParser.jsm" line: 240}]
James, I fear that with Kitkat all your parsing code has been broken because of https://android.googlesource.com/platform/bionic/+/4482d8513920d614e3799a214c5bad07b72580d2

Which advocates for using bionic's calls to read properties instead of re-doing the parsing on our own :)
Assignee: nobody → hobinjk
Flags: needinfo?(hobinjk)
Given that B2G 2.1 on Flame is supposed to be KK ...
blocking-b2g: --- → 2.2?
Summary: Shake to save system log doesn't work → [LogShake] LogParser.jsm broken on Kitkat
(In reply to Alexandre LISSY :gerard-majax from comment #7)
> James, I fear that with Kitkat all your parsing code has been broken because
> of
> https://android.googlesource.com/platform/bionic/+/
> 4482d8513920d614e3799a214c5bad07b72580d2
> 
> Which advocates for using bionic's calls to read properties instead of
> re-doing the parsing on our own :)

Oh no! If only I kept the patch where I used the native get_all_properties approach...it should be in the LogShake obsolete patches.
Flags: needinfo?(hobinjk)
Depends on: 1019816
Depends on: 1092074
Assignee: hobinjk → lissyx+mozillians
Attached patch logshake-kk-wip.patch (obsolete) — Splinter Review
A first WIP; not even tested yet.
Attached patch logshake-kk-wip.patch (obsolete) — Splinter Review
That boots and builds new ctypes binding, but b2g crashes on making the call to |this.property_list(callback, list);|
Attachment #8518920 - Attachment is obsolete: true
Attached patch logshake-kk-wip.patch (obsolete) — Splinter Review
Updated. As far as I can tell, from the gdb backtrace I could get when doing a test:
 - we properly call the property_list function from libcutils
 - it loops as expected
 - the segfault happens on the callback call at https://www.codeaurora.org/cgit/quic/la/platform/system/core/tree/libcutils/properties.c?h=kk_rb5#n68
Attachment #8518936 - Attachment is obsolete: true
David, I think you played quite a bit with js ctypes. Would you mind looking and making sure I made no mistake? Specifically regarding the way defining and using functions pointers.

We have property_list() prototype defined at https://www.codeaurora.org/cgit/quic/la/platform/system/core/tree/include/cutils/properties.h?h=kk_rb5#n51

And basically we want to do the same as getprop: https://www.codeaurora.org/cgit/quic/la/platform/system/core/tree/toolbox/getprop.c?h=kk_rb5
Flags: needinfo?(dteller)
Whiteboard: [systemsfe]
Attached file stacktrace (obsolete) —
Attached file dump @0xa9c10660 (obsolete) —
Comment on attachment 8518971 [details] [diff] [review]
logshake-kk-wip.patch

Review of attachment 8518971 [details] [diff] [review]:
-----------------------------------------------------------------

::: b2g/components/LogCapture.jsm
@@ +166,5 @@
> +
> +    let list = {};
> +
> +    debug("readProperties: calling ...");
> +    this.property_list(callback, list);

`list` looks fishy to me. `this.property_list` expects a void*, and I'm not sure that a random JS object can be converted to a `void*`. For testing purposes, I would rather use `null`.
I didn't see anything that looks bad in the code, except `list`. Now, I haven't used callbacks from C, so there may be pitfalls I don't know.
Flags: needinfo?(dteller)
(In reply to Alexandre LISSY :gerard-majax from comment #14)
> Created attachment 8519442 [details]
> stacktrace

The approach I used at first in LogShake was defined in https://bug1019816.bugzilla.mozilla.org/attachment.cgi?id=8439365

This uses the functions __system_property_find_nth and __system_property_read because I couldn't find a way to make a suitable function pointer.
(In reply to David Rajchenbach-Teller [:Yoric] (away until November 17th - use "needinfo") from comment #17)
> I didn't see anything that looks bad in the code, except `list`. Now, I
> haven't used callbacks from C, so there may be pitfalls I don't know.

As replied on IRC, I already tried with passing |null|, but that still makes the callback call segfaulting, as documented in the stacktrace in attachment 8519442 [details].
Passing |null| as the callback, in gdb I have a jump to 0x00000000:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 5762.5762]
> 0x00000000 in ?? ()
> (gdb) bt
> #0  0x00000000 in ?? ()
> #1  0xb6e5ec64 in property_list_callback (pi=<optimized out>, cookie=0xbeebeb30) at system/core/libcutils/properties.c:68
> #2  0xb6ebb664 in foreach_property (off=16932, propfn=propfn@entry=0xb6e5ec41 <property_list_callback>, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:686
> #3  0xb6ebb678 in foreach_property (off=16900, propfn=propfn@entry=0xb6e5ec41 <property_list_callback>, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:689
> #4  0xb6ebb678 in foreach_property (off=16876, propfn=propfn@entry=0xb6e5ec41 <property_list_callback>, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:689
> #5  0xb6ebb650 in foreach_property (off=13316, propfn=propfn@entry=0xb6e5ec41 <property_list_callback>, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:678
> #6  0xb6ebb650 in foreach_property (off=20, propfn=propfn@entry=0xb6e5ec41 <property_list_callback>, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:678
> #7  0xb6ebb678 in foreach_property (off=0, propfn=0xb6e5ec41 <property_list_callback>, propfn@entry=0xb6a311b0, cookie=0xbeebeb30) at bionic/libc/bionic/system_properties.c:689
> #8  0xb6ebbcfe in __system_property_foreach (propfn=0xb6a311b0, cookie=cookie@entry=0xbeebeb30) at bionic/libc/bionic/system_properties.c:708
> #9  0xb6e5ecb8 in property_list (propfn=<optimized out>, cookie=<optimized out>) at system/core/libcutils/properties.c:76
> #10 0xb5f00b84 in ffi_call_SYSV () from /home/alex/codaz/Mozilla/b2g/devices/Flame/B2G.KK/objdir-gecko/toolkit/library/libxul.so
> #11 0xb5f0019e in ffi_call () from /home/alex/codaz/Mozilla/b2g/devices/Flame/B2G.KK/objdir-gecko/toolkit/library/libxul.so
> #12 0xb5cad80c in js::ctypes::FunctionType::Call (cx=0xb6a311b0, argc=<optimized out>, vp=0xb3211178) at ../../../gecko/js/src/ctypes/CTypes.cpp:5914
Attached file oops in dmesg (obsolete) —
Making some progress:

> 11-11 16:48:17.740 26680 26680 I Gecko   : LogCapture.jsm: readProperties: E
> 11-11 16:48:17.740 26680 26680 I Gecko   : LogCapture.jsm: load: E
> 11-11 16:48:17.740 26680 26680 I Gecko   : LogCapture.jsm: readProperties: O
> 11-11 16:48:17.740 26680 26680 I Gecko   : LogCapture.jsm: readProperties: calling ...
> 11-11 16:48:17.740 26680 26680 I Gecko   : property_list_all: E
> 11-11 16:48:17.740 26680 26680 I Gecko   : property_list_all: C
> 11-11 16:48:17.740 26680 26680 I Gecko   : recordProperty: k=ctypes.char.array(32)([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
> 11-11 16:48:17.740 26680 26680 I Gecko   : recordProperty: n=ctypes.char.array(92)([102, 108, 97, 109, 101, 0, 101, 114, 105, 97, 108, 95, 115, 109, 100, 44, 115, 101, 114, 105, 97, 108, 95, 116, 116, 121, 44, 114, 109, 110, 101, 116, 95, 98, 97, 109, 44, 109, 97, 115, 115, 95, 115, 116, 111, 114, 97, 103, 101, 44, 97, 100, 98, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
> 11-11 16:48:17.740 26680 26680 I Gecko   : recordProperty: key=
> 11-11 16:48:17.740 26680 26680 I Gecko   : recordProperty: name=flame
> 11-11 16:48:17.740 26680 26680 I Gecko   : recordProperty: no key or dupe: ''
> 11-11 16:48:17.740 26680 26680 W GeckoConsole: [JavaScript Error: "TypeError: expected type pointer, got (void 0)" {file: "resource://gre/modules/systemlibs.js" line: 124}]
Attachment #8518971 - Attachment is obsolete: true
Attached patch logshake-kk-wip-v3.patch (obsolete) — Splinter Review
Just taking back the old code that makes use of libc calls. That's not as nice as using the libcutils ones, but at least it works :(.
Attachment #8519442 - Attachment is obsolete: true
Attachment #8519444 - Attachment is obsolete: true
Attachment #8520587 - Attachment is obsolete: true
The LogShake features allows one to shake its device to be able to dump
a set of useful logging informations. This includes several log files,
and also the Android properties. In the past, we relied on the
/dev/__properties__ file to extract their content by parsing its value.
This is duplicate work from the bionic libc and libcutils library.
Worst, the format used to store the values in this file has been changed
between JellyBean and Kitkat, so our parser was not able to dump the
values: that explains bug 1079322. To fix this we make use of some of
the underlying libc-defined functions used to iterate and get properties
values:
 - __system_property_find_nth() to retrieve one arbitrary property by
   its number (starting from 0), and this returns a struct containing
   all the informations
 - __system_property_read() to read the values contained in the struct
   that was previously retrieved
Comment on attachment 8520747 [details] [diff] [review]
Extract properties using libc functions r=gwagner

This has been successfully tested on:
 - Flame with KK base
 - Xperia ZR with JB base
 - Nexus S with ICS base
Attachment #8520747 - Flags: review?(anygregor)
The LogShake features allows one to shake its device to be able to dump
a set of useful logging informations. This includes several log files,
and also the Android properties. In the past, we relied on the
/dev/__properties__ file to extract their content by parsing its value.
This is duplicate work from the bionic libc and libcutils library.
Worst, the format used to store the values in this file has been changed
between JellyBean and Kitkat, so our parser was not able to dump the
values: that explains bug 1079322. To fix this we make use of some of
the underlying libc-defined functions used to iterate and get properties
values:
 - __system_property_find_nth() to retrieve one arbitrary property by
   its number (starting from 0), and this returns a struct containing
   all the informations
 - __system_property_read() to read the values contained in the struct
   that was previously retrieved
(In reply to Alexandre LISSY :gerard-majax from comment #27)
> pending try: https://tbpl.mozilla.org/?tree=Try&rev=8950d057363e

Failing because I forgot to update tests.
Attachment #8521349 - Flags: review?(anygregor)
Attachment #8520733 - Attachment is obsolete: true
Attachment #8520747 - Attachment is obsolete: true
Attachment #8520747 - Flags: review?(anygregor)
(In reply to Alexandre LISSY :gerard-majax from comment #30)
> New try: https://tbpl.mozilla.org/?tree=Try&rev=6a52d67f471d

Failing because the changes to LogCapture tests are not good: but the code works, according to the error :)
The LogShake features allows one to shake its device to be able to dump
a set of useful logging informations. This includes several log files,
and also the Android properties. In the past, we relied on the
/dev/__properties__ file to extract their content by parsing its value.
This is duplicate work from the bionic libc and libcutils library.
Worst, the format used to store the values in this file has been changed
between JellyBean and Kitkat, so our parser was not able to dump the
values: that explains bug 1079322. To fix this we make use of some of
the underlying libc-defined functions used to iterate and get properties
values:
 - __system_property_find_nth() to retrieve one arbitrary property by
   its number (starting from 0), and this returns a struct containing
   all the informations
 - __system_property_read() to read the values contained in the struct
   that was previously retrieved
Attachment #8521410 - Attachment is obsolete: true
Attachment #8521349 - Attachment is obsolete: true
Attachment #8521349 - Flags: review?(anygregor)
Attachment #8521410 - Attachment is obsolete: false
Attachment #8521410 - Flags: review?(anygregor)
(In reply to Alexandre LISSY :gerard-majax from comment #33)
> New try: https://tbpl.mozilla.org/?tree=Try&rev=3cd4deaeca08

Green.
Attachment #8521410 - Flags: review?(anygregor) → review+
https://hg.mozilla.org/mozilla-central/rev/5ff56ba2aeee
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Verified the issue fixed on 2.2 master Flame

After shaking the device the logfile is created and could be found by the $ adb shell ls /storage/sdcard0/logs command

"Flame 2.2

Device: Flame 2.2  (319mb)(Kitkat Base)(Shallow Flash)
Build ID: 20141125040209
Gaia: 824a61cccec4c69be9a86ad5cb629a1f61fa142f
Gecko: acde07cb4e4d
Version: 36.0a1 (2.2)
Firmware Version: v188-1
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0"
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
blocking-b2g: 2.2? → 2.2+
Blocks: 1019816
No longer depends on: 1019816
You need to log in before you can comment on or make changes to this bug.