Closed Bug 1325388 Opened 7 years ago Closed 7 years ago

verify socorro collector and antenna produce the same files on S3 [antenna]

Categories

(Socorro :: Antenna, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

It's important that Antenna generates raw_crash files that are identical to the ones the Socorro collector is generating.

Given that, we need to pass a crash through the Socorro collector and through Antenna, and then verify that the files saved in S3 are identical apart from the timestamp fields, which will differ.
Grabbing this to do next year. Pretty sure it's not hard to do--it's just setting things up right.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Took a while to get to this. Bleh.

Anyhow, they produce the exact same files with two differences:

1. The raw_crash file that Antenna produces has an additional "percentage" field.

2. The raw_crash file that Antenna produces sometimes has a "legacy_processing" field that's "1" whereas the collector counterpart has a "0".

I'm going to chase these down today and figure out whether they're problematic and whether we need to make changes.
Er, going back a bit, what I did was go through crashes in the -stage S3 bucket and pull out a bunch that cover the various dump files (e.g. memory_report, upload_file_minidump, upload_file_minidump*). I also used crashes produced after the pseudo-filename changes for raw crash files.

Crash ids used:

000000a6-5695-44f2-9bff-67b392160921
00000019-c413-4ada-a50b-4eab42170120
0000062b-dc96-414f-9f85-4fbfa2160902
000015cd-6bdb-4d4f-ba14-d8ecd2161105
00000ad5-4f09-465e-835b-dae982170113
00000113-f38d-42e8-a68e-958f22161207
0001285d-c2e2-49c1-8ba9-a339b2170117

Is this set truly representative? Definitely not--but my intuition suggests it's probably good enough. I figured if I saw any problems in these, I'd do a more thorough examination.

Then I wrote a crash poster that took the files, did some minor munging just like what the stage submitter does, assembled the result into an HTTP POST payload, and POSTed that to the specified url. I had a bunch of this code written already, but I redid parts of it to make it easier to use for this investigation.

https://github.com/mozilla/antenna/blob/587dea0574003832fc2686aa9d1eea6c9a6743fb/testlib/mini_poster.py
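
For readers who don't want to open the linked script, here is a minimal sketch of the "assemble and POST" step. It is not the code in mini_poster.py. The function names (build_crash_payload, post_crash) and the ".dump" filename suffix are made up for illustration, and the munging step is left out entirely. It only assumes that a crash report is submitted as multipart/form-data, with annotations as form fields and dumps as file fields.

```python
import io
import urllib.request
import uuid


def build_crash_payload(raw_crash, dumps):
    """Assemble a multipart/form-data body (hypothetical helper).

    raw_crash: dict of annotation name -> value
    dumps: dict of dump name -> bytes
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Annotations go in as plain form fields.
    for name, value in sorted(raw_crash.items()):
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(str(value).encode("utf-8"))
        body.write(b"\r\n")

    # Dumps go in as file fields; the filename here is an arbitrary placeholder.
    for name, data in sorted(dumps.items()):
        body.write(f"--{boundary}\r\n".encode())
        body.write(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{name}.dump"\r\n'.encode()
        )
        body.write(b"Content-Type: application/octet-stream\r\n\r\n")
        body.write(data)
        body.write(b"\r\n")

    body.write(f"--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def post_crash(url, raw_crash, dumps):
    """POST one crash to the given collector url and return the response body."""
    body, content_type = build_crash_payload(raw_crash, dumps)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Pointing post_crash at the Socorro collector and then at Antenna with the same inputs gives the two sets of saved files to compare.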

Then I dockerized the Socorro collector and set it up using a prod-like configuration. I had done most of this a while back, but spent time updating the infrastructure, making sure it was correct and rigging it to work with this investigation.

https://github.com/willkg/socorro-zero

Then I ran the Socorro collector and Antenna, posted all the crashes and compared the output in the fakes3 directories using a script that diffs the raw crash files, verifies the dump_names files are the same and verifies we saved all the appropriate dump files.
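
The raw crash diffing part of that script boils down to something like the sketch below. This is not the actual script. The function name is hypothetical, and the two ignored field names (submitted_timestamp, timestamp) are my assumption about which timestamp fields differ between runs. It takes already-loaded raw crash dicts rather than guessing at the fakes3 directory layout.

```python
# Assumption: these are the timestamp fields that legitimately differ.
IGNORED_FIELDS = frozenset({"submitted_timestamp", "timestamp"})


def diff_raw_crashes(collector_crash, antenna_crash, ignored=IGNORED_FIELDS):
    """Return {field: (collector_value, antenna_value)} for differing fields.

    A field that is present on only one side shows up with "<missing>"
    on the other side.
    """
    missing = object()
    diffs = {}
    for key in sorted(set(collector_crash) | set(antenna_crash)):
        if key in ignored:
            continue
        left = collector_crash.get(key, missing)
        right = antenna_crash.get(key, missing)
        # Plain != also catches type differences, e.g. 10 versus "10".
        if left != right:
            diffs[key] = (
                "<missing>" if left is missing else left,
                "<missing>" if right is missing else right,
            )
    return diffs
```

An empty result means the two raw crashes match apart from the ignored fields. Because the comparison is type-sensitive, an int on one side and a string with the same digits on the other is reported as a difference.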

That's it!
There is a difference between Collector and Antenna: if the crash had legacy_processing=0, the Collector will save and reprocess it. However, Antenna is additionally looking for a percentage value and since it's not in these crashes, it redoes the throttling and gets a different throttle result (0 is ACCEPT, 1 is DEFER).

There's one other difference. In the Collector, throttle_rate is an int. In Antenna, it's being saved as a string. It holds the same value that percentage has.

Given that, I'm going to:

1. switch from saving percentage to saving throttle_rate in raw_crash

2. switch from checking percentage to checking throttle_rate for throttle directions

3. make sure we save throttle_rate as an int
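
Roughly, the intended behavior after those three changes looks like the sketch below. This is not Antenna's actual code. The function names and the run_throttler callable are hypothetical stand-ins. The only things taken from above are the field names and the result values (0 is ACCEPT, 1 is DEFER).

```python
ACCEPT = 0
DEFER = 1


def get_throttle_result(raw_crash, run_throttler):
    """Return (legacy_processing, throttle_rate) as ints.

    run_throttler is a hypothetical callable that takes the raw crash and
    returns (result, rate). It is only called when the crash doesn't
    already carry a usable throttle result.
    """
    if "legacy_processing" in raw_crash and "throttle_rate" in raw_crash:
        try:
            # Reuse the earlier throttle result rather than re-throttling.
            return int(raw_crash["legacy_processing"]), int(raw_crash["throttle_rate"])
        except (TypeError, ValueError):
            # Unusable values: fall through and throttle from scratch.
            pass
    result, rate = run_throttler(raw_crash)
    return int(result), int(rate)


def annotate_raw_crash(raw_crash, run_throttler):
    """Save the throttle result in the raw crash, with throttle_rate as an int."""
    result, rate = get_throttle_result(raw_crash, run_throttler)
    raw_crash["legacy_processing"] = result
    raw_crash["throttle_rate"] = rate
    return raw_crash
```

With this shape, a crash that arrives with legacy_processing=0 and a throttle_rate keeps its ACCEPT result even if the throttler would have picked DEFER this time.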
Landed it in master. Marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Switching Antenna bugs to Antenna component.
Component: General → Antenna