Closed Bug 915850 Opened 11 years ago Closed 11 years ago

Strip weird characters from "info.adapterDescription" field in telemetry payloads

Categories

(Toolkit :: Telemetry, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: mreid, Assigned: froydnj)

Details

Attachments

(1 file)

We appear to be receiving characters in the info.adapterDescription field that cause the resulting JSON payloads not to parse (using Python's "simplejson" library).

An example of such a value is:
"adapterDescription":"Xadi[some bogus characters]ipov� sady 0 Mobile Intel(R) 945GM Express (Microsoft Corporation - WDDM)"

Here is the whole "info" part of the payload for one such record:
"info": {
    "flashVersion": "11.8.800.94",
    "addons": "%7B20a82645-c095-46ed-80e3-08825760534b%7D:0.0.0,%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D:23.0.1",
    "DWriteVersion": "0.0.0.0",
    "adapterDriverDate": "8-21-2006",
    "adapterDriverVersion": "7.14.10.1103",
    "hasSSE": true,
    "hasMMX": true,
    "version": "6.0",
    "arch": "x86",
    "memsize": 2038,
    "cpucount": 2,
    "locale": "cs",
    "revision": "http://hg.mozilla.org/releases/mozilla-release/rev/a55c55edf302",
    "reason": "saved-session",
    "OS": "WINNT",
    "appID": "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",
    "appVersion": "23.0.1",
    "appName": "Firefox",
    "appBuildID": "20130814063812",
    "appUpdateChannel": "release",
    "platformBuildID": "20130814063812",
    "hasSSE2": true,
    "hasSSE3": true,
    "hasSSSE3": true,
    "hasSSE4A": false,
    "hasSSE4_1": false,
    "hasSSE4_2": false,
    "hasEDSP": false,
    "hasARMv6": false,
    "hasARMv7": false,
    "hasNEON": false,
    "isWow64": false,
    "adapterDescription": "...snip...",
    "adapterVendorID": "0x8086",
    "adapterDeviceID": "0x27a2",
    "adapterRAM": "Unknown",
    "adapterDriver": "igdumd32"
}
(In reply to Mark Reid [:mreid] from comment #0)
> We appear to be receiving characters in the info.adapterDescription field
> that cause the resulting JSON payloads not to parse (using Python's
> "simplejson" library).
> 
> An example of such a value is:
> "adapterDescription":"Xadi[some bogus characters]ipov� sady 0 Mobile
> Intel(R) 945GM Express (Microsoft Corporation - WDDM)"

Can you provide the raw bytes here?
Flags: needinfo?(mreid)
I'm not 100% certain of the bytes for this one, since I extracted it from a log file that adds some other stuff, but here's what I'm seeing:

...
0013880: 6173 4e45 4f4e 223a 6661 6c73 652c 2269  asNEON":false,"i
0013890: 7357 6f77 3634 223a 6661 6c73 652c 2261  sWow64":false,"a
00138a0: 6461 7074 6572 4465 7363 7269 7074 696f  dapterDescriptio
00138b0: 6e22 3a22 5861 6469 0d20 0d69 706f 76ef  n":"Xadi. .ipov.
00138c0: bfbd 2073 6164 7920 3020 4d6f 6269 6c65  .. sady 0 Mobile
00138d0: 2049 6e74 656c 2852 2920 3934 3547 4d20   Intel(R) 945GM 
00138e0: 4578 7072 6573 7320 284d 6963 726f 736f  Express (Microso
00138f0: 6674 2043 6f72 706f 7261 7469 6f6e 202d  ft Corporation -
0013900: 2057 4444 4d29 222c 2261 6461 7074 6572   WDDM)","adapter
0013910: 5665 6e64 6f72 4944 223a 2230 7838 3038  VendorID":"0x808
0013920: 3622 2c22 6164 6170 7465 7244 6576 6963  6","adapterDevic
...
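
The dump above shows two raw 0x0d (carriage return) bytes inside the string value. As a rough sketch of why a strict parser would reject that, the snippet below uses Python's standard `json` module on a shortened fragment modeled on the dump. Using `json` in place of simplejson is an assumption here, and the fragment and the `try_parse` helper are made up for illustration.

```python
import json

# Shortened fragment modeled on the dump: the raw 0x0d bytes are kept, and
# ef bf bd is the UTF-8 encoding of U+FFFD, as seen in the log.
raw = b'{"adapterDescription":"Xadi\r \ripov\xef\xbf\xbd sady 0"}'
text = raw.decode("utf-8")

def try_parse(s, strict):
    """Return the parsed object, or the error message if parsing fails."""
    try:
        return json.loads(s, strict=strict)
    except ValueError as e:
        return str(e)

# strict=True is the default: unescaped control characters in strings are errors.
strict_result = try_parse(text, strict=True)
# strict=False tolerates them, which is one possible server-side workaround.
lenient_result = try_parse(text, strict=False)

print(strict_result)
print(lenient_result)
```

If that reading is right, the parse failure comes from the unescaped control characters rather than from the U+FFFD replacement character, which is legal inside a JSON string.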


Another more recent value that's causing trouble, and for which I can easily get the raw bytes:
0000000: 2261 6461 7074 6572 4465 7363 7269 7074  "adapterDescript
0000010: 696f 6e22 3a22 2135 3c35 3941 4232 3e20  ion":"!5<59AB2> 
0000020: 3d30 313e 403e 3220 3c38 3a40 3e41 4535  =01>@>2 <8:@>AE5
0000030: 3c20 4d6f 6269 6c65 2049 6e74 656c 2852  < Mobile Intel(R
0000040: 2920 3435 2045 7870 7265 7373 2028 3a3e  ) 45 Express (:>
0000050: 403f 3e40 3046 384f 201c 3039 3a40 3e41  @?>@0F8O .09:@>A
0000060: 3e44 4220 2d20 5744 444d 2031 2e31 2922  >DB - WDDM 1.1)"
0000070: 0a                                       .
Flags: needinfo?(mreid)
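
One reading that is consistent with both dumps, though nothing in this bug states it, is that each UTF-16 code unit of the original description was truncated to its low byte somewhere before upload. The sketch below tests that guess. The original strings (Czech "Řadič čipové sady", Russian "Семейство" and "Майкрософт") are inferred, not taken from any payload, and `low_bytes` is a hypothetical helper.

```python
def low_bytes(s: str) -> bytes:
    """Keep only the low 8 bits of each UTF-16 code unit (assumed lossy step)."""
    units = s.encode("utf-16-le")
    return units[0::2]  # little-endian: the low byte of each unit comes first

# Guessed originals, written with escapes so the code points are unambiguous.
czech = low_bytes("\u0158adi\u010d \u010dipov\u00e9 sady")  # "Řadič čipové sady"
russian = low_bytes("\u0421\u0435\u043c\u0435\u0439\u0441\u0442\u0432\u043e")  # "Семейство"
vendor = low_bytes("\u041c\u0430\u0439\u043a\u0440\u043e\u0441\u043e\u0444\u0442")  # "Майкрософт"

print(czech)
print(russian)
print(vendor)
```

Under this assumption, U+010D becomes the 0x0d seen in the first dump and U+041C becomes the 0x1c seen in the second. The lone 0xe9 is not valid UTF-8 on its own, which would explain the ef bf bd (U+FFFD) in the log.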
I could have sworn we had this somewhere along the line and/or it was an
enhancement request bug somewhere.  Whatever the case, we should fix it.
Dunno if this will fix the Windows cases, but it DTRT in my testing.

We already save pings as UTF-8, we just weren't being careful to gzip them
as UTF-8.
Attachment #814991 - Flags: review?(vdjeric)
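
For illustration only, here is the same idea in Python rather than the actual TelemetryPing.js change: encode the serialized ping to UTF-8 bytes explicitly before compressing it, so that non-ASCII characters survive the round trip. The payload below is a made-up example, not real telemetry data.

```python
import gzip
import json

# Hypothetical ping fragment with non-ASCII characters in the description.
payload = {"adapterDescription": "\u0158adi\u010d \u010dipov\u00e9 sady"}

# Serialize, then encode to UTF-8 bytes explicitly before compressing.
text = json.dumps(payload, ensure_ascii=False)
compressed = gzip.compress(text.encode("utf-8"))

# Receiving side: decompress, decode as UTF-8, parse.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
print(restored == payload)
```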
Comment on attachment 814991
ensure that telemetry pings are sent as UTF-8

(In reply to Nathan Froyd (:froydnj) from comment #3)
> I could have sworn we had this somewhere along the line and/or it was an
> enhancement request bug somewhere.  

It was bug 769844. You closed it ;)

Follow up with mreid to verify this doesn't break anything.
Attachment #814991 - Flags: review?(vdjeric) → review+
If you can generate a payload with UTF-8 characters, it should be fairly easy to run the telemetry server locally to verify this, or I can spin up an AWS instance for you to use, just let me know.
The test changes included in the patch do fail without the extra encoding steps in TelemetryPing.js.  But I can try sync'ing with Mark tomorrow to make sure the changes work with an external server too.
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc8470d22a26
Assignee: nobody → nfroyd
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/bc8470d22a26
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27