Closed Bug 1140937 Opened 10 years ago Closed 9 years ago

Change missing symbols cron to include filename and code_id

Categories

(Socorro :: Backend, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ted, Assigned: peterbe)

Attachments

(3 files)

To fix bug 1140358 we need to be able to fetch the actual DLL file from Microsoft's symbol server, which means we need the filename and the code_id in the missing symbols report. We don't currently have code_id in the stackwalker JSON output; I just submitted a PR to add it: https://github.com/mozilla/socorro/pull/2652
This needs more work than I thought, since we actually store the data in a missing_symbols table.
The table schema needs to be updated to add these fields: https://github.com/mozilla/socorro/blob/8e8f9c802601fe5699c20aa0e9962bac84d1a680/socorro/external/postgresql/models.py#L1487
The processor rule that does the insert needs its SQL and parameters updated: https://github.com/mozilla/socorro/blob/f735c524510c75bca45895987e0c4321ad982743/socorro/processor/mozilla_transform_rules.py#L751
And then the cron job's SQL query needs to be updated to include the new columns: https://github.com/mozilla/socorro/blob/a8adba2ac34976bd1f34f073d818f3a8e3013b1b/scripts/crons/cron_missing_symbols.sh
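The three pieces listed in this comment (schema, insert, report query) can be sketched together as follows. This is only an illustration using Python's built-in sqlite3 in place of PostgreSQL; the table layout, the SQL strings, and the helper names `record_missing_symbol` and `daily_report` are assumptions made for the example, not the actual Socorro code. The use of the module key `filename` is likewise an assumption based on the later commit message in this bug.

```python
import sqlite3

# Hypothetical, simplified stand-in for the missing_symbols table.
# Column names are assumptions drawn from this bug, not copied from models.py.
SCHEMA = """
CREATE TABLE missing_symbols (
    date_processed TEXT NOT NULL,
    debug_file TEXT,
    debug_id TEXT,
    code_file TEXT,
    code_id TEXT
)
"""

# The insert now carries the two new columns as extra parameters.
INSERT_SQL = """
INSERT INTO missing_symbols
    (date_processed, debug_file, debug_id, code_file, code_id)
VALUES (?, ?, ?, ?, ?)
"""

# The report query selects and groups by the new columns as well.
REPORT_SQL = """
SELECT debug_file, debug_id, code_file, code_id
FROM missing_symbols
WHERE date_processed = ?
GROUP BY debug_file, debug_id, code_file, code_id
ORDER BY debug_file, debug_id
"""


def record_missing_symbol(conn, date_processed, module):
    """Insert one module; modules without code data are stored with NULLs."""
    conn.execute(
        INSERT_SQL,
        (
            date_processed,
            module.get("debug_file"),
            module.get("debug_id"),
            module.get("filename"),  # assumed key name in the module dict
            module.get("code_id"),
        ),
    )


def daily_report(conn, date_processed):
    """Return the unique missing-symbol rows for one day."""
    return conn.execute(REPORT_SQL, (date_processed,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
example = {
    "debug_file": "example.pdb",
    "debug_id": "ABC123",
    "filename": "example.dll",
    "code_id": "DEF456",
}
# The same module seen in two crashes should appear once in the report.
record_missing_symbol(conn, "2016-06-01", example)
record_missing_symbol(conn, "2016-06-01", example)
rows = daily_report(conn, "2016-06-01")
```

Using `module.get(...)` keeps the insert working for older stackwalker output that lacks code_id, at the cost of NULLs in the new columns.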
Selena sez the database schema migration bits are easy: `alembic revision --autogenerate -m "changing stuff"`
Help me understand what remains here. Is it just adding `code_id` to the missing symbols table? If so, I can write a migration that adds that. Note to self: we need to edit models.py, cron_missing_symbols.sh and MissingSymbolsRule to also handle code_id.
See Also: → 1270190
Semi-fun fact:
breakpad=> select count(*) from missing_symbols;
   count
------------
 2014832300
fetch-win32-symbols/symsrv-fetch.py [1] expects [debug_file, debug_id, code_file, code_id] in each row of 20xxxxxx-missing-symbols.txt, so I think not only code_id but also code_file is needed. Would you fix this?
[1] http://hg.mozilla.org/users/tmielczarek_mozilla.com/fetch-win32-symbols/file/tip/symsrv-fetch.py#l260
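To make the expected row shape concrete, here is a minimal sketch of a reader for that file. It is not the code from symsrv-fetch.py; the comma delimiter, the `MissingSymbol` type, and the `parse_missing_symbols` helper are assumptions for illustration only.

```python
import csv
import io
from typing import NamedTuple


class MissingSymbol(NamedTuple):
    """One row in the order named in the comment above."""
    debug_file: str
    debug_id: str
    code_file: str
    code_id: str


def parse_missing_symbols(text):
    """Return rows that carry all four fields; shorter rows are skipped.

    Assumes comma-separated values, which is a guess at the file format.
    """
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            # Old-style rows without code_file/code_id cannot be used
            # to fetch the binary, so they are ignored here.
            continue
        entries.append(MissingSymbol(*(field.strip() for field in row[:4])))
    return entries


sample = "example.pdb,ABC123,example.dll,DEF456\nold.pdb,0123\n"
entries = parse_missing_symbols(sample)
```

A consumer written this way would silently drop every row until the report includes both code_file and code_id, which is why both columns matter.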
Flags: needinfo?(peterbe)
Comment 1 describes all the changes that need to be made, I believe. (I tracked everything down back then, but I'm not sure it's complete.)
Assignee: nobody → peterbe
Flags: needinfo?(peterbe)
The above-mentioned PR only adds the migration, so I can proceed with the code changes afterwards. That makes it smoother and requires less precise deployment timing.
Adrian, we need your help to push this further, would you please review the PR (attachment 8756928 [details] [review])? Thank you.
Flags: needinfo?(adrian)
Flags: needinfo?(adrian)
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/bfce78777199668e016db6232ef828ccc2320f8b bug 1140937 - Change missing symbols cron to include filename and code_id (#3358) r=adngdb
The migration has been run on STAGE. Here's how I did it.
$ sudo yum -y remove socorro && sudo yum -y install socorro
$ cd /data/socorro/application/
$ . /data/socorro/socorro-virtualenv/bin/activate
$ envconsul -prefix socorro/common -sanitize -upcase env | grep POSTGRESQL
$ env sqlalchemy.url=postgresql://breakpad_rw:**PASSWORD**@socorro-db.mocotoolsstaging.net/breakpad alembic -c /etc/socorro/alembic.ini current
$ env sqlalchemy.url=postgresql://breakpad_rw:**PASSWORD**@socorro-db.mocotoolsstaging.net/breakpad alembic -c /etc/socorro/alembic.ini upgrade head
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/3cf89a3c73e1501335d63c016a5119a084557b32
bug 1140937 - update missing symbols rule to for code_*
https://github.com/mozilla/socorro/commit/a90bd539b0ca763074a80f48073f8dd72cf12239
Merge pull request #3361 from peterbe/bug-1140937-update-missing-symbols-rule-to-for-code_ bug 1140937 - update missing symbols rule to for code_*
The new migration has been run on stage AND PROD now.
We have over 2 billion rows in our missing_symbols table. That makes it incredibly slow to do queries on that table. Just doing a `select count(*) from missing_symbols where date_processed = '2016-06-01';` takes minutes.
If you don't mind, we can truncate that down to only keep the last 2 months' worth.
Step 2 is to move that to a dynamic query directly from the webapp. E.g. instead of https://crash-analysis.mozilla.com/crash_analysis/20160605/20160605-missing-symbols.txt it'd be something like https://crash-stats.mozilla.com/missing-symbols.csv?date=20160605
We'd generate the output on the fly. It's about 100k records per day, about 5-6 MB. So we don't want to cache it, and to avoid DDoS we'd put it behind an auth token. Thus, more hassle for you, since you'd need to supply that token.
Step 3 would be a cron job that truncates the table every now and then to only have the last 2 months' worth of missing symbols.
What do you think? Part of me is tempted to not care, since what we have is working (but slow and resource-hogging).
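The periodic pruning proposed in step 3 could look roughly like the sketch below. It uses sqlite3 rather than PostgreSQL, treats "2 months" as 60 days, and stores dates as ISO text; all of these, along with the `prune_missing_symbols` name, are assumptions for illustration and not what was deployed.

```python
import sqlite3
from datetime import date, timedelta

RETENTION_DAYS = 60  # assumption: "2 months" approximated as 60 days


def prune_missing_symbols(conn, today, retention_days=RETENTION_DAYS):
    """Delete rows older than the retention window; return how many went."""
    cutoff = (today - timedelta(days=retention_days)).isoformat()
    # ISO dates stored as text compare correctly as plain strings.
    cursor = conn.execute(
        "DELETE FROM missing_symbols WHERE date_processed < ?", (cutoff,)
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE missing_symbols "
    "(date_processed TEXT, debug_file TEXT, debug_id TEXT)"
)
conn.executemany(
    "INSERT INTO missing_symbols VALUES (?, ?, ?)",
    [
        ("2016-01-15", "old.pdb", "AAA"),
        ("2016-06-01", "new.pdb", "BBB"),
    ],
)
deleted = prune_missing_symbols(conn, today=date(2016, 6, 5))
remaining = conn.execute("SELECT debug_file FROM missing_symbols").fetchall()
```

On a table with billions of rows, a single large delete may itself be slow, so a real job would probably need batching or partition drops; that is outside this sketch.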
Flags: needinfo?(ted)
For the record, it takes 11 minutes to run the missing symbol cron job on prod:
[centos@prod-admin-i-ecb76837 ~]$ time psql -U breakpad_rw -h socorro-db.mocotoolsprod.net breakpad < peterbe__missing_symbols_20160601_orig.sql > peterbe__missing_symbols_20160601_orig.csv
Password for user breakpad_rw:
real 10m55.910s
user 0m0.036s
sys 0m0.020s
Yikes. That's severe. :)
My script basically only cares about getting a sample of the last day's worth of missing symbols: http://hg.mozilla.org/users/tmielczarek_mozilla.com/fetch-win32-symbols/file/abab188e9b06/symsrv-fetch.py#l158
The specifics of the CSV files are not terribly important; a day is just a small enough chunk of time to be manageable. If you want to turn this into a query, I'd be fine with an API that didn't take any parameters and just returned the unique rows from the last N hours, or you could keep the table truncated on a cron and make the API return all unique rows from the table. Either of those would suffice.
As I said on IRC, I suspect having 2B rows in that table is an accident. Maybe there was supposed to be a truncate job that never got implemented, or got lost along the way?
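The first option in this comment, returning the unique rows from the last N hours, might be sketched like this. The schema, the text timestamp format, and the `recent_unique_rows` helper are hypothetical and only meant to show the shape of such a query, using sqlite3 in place of PostgreSQL.

```python
import sqlite3
from datetime import datetime, timedelta


def recent_unique_rows(conn, now, hours=24):
    """Return distinct missing-symbol rows processed in the last N hours."""
    since = (now - timedelta(hours=hours)).isoformat(sep=" ")
    return conn.execute(
        "SELECT DISTINCT debug_file, debug_id, code_file, code_id "
        "FROM missing_symbols "
        "WHERE date_processed >= ? "
        "ORDER BY debug_file, debug_id",
        (since,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE missing_symbols (date_processed TEXT, debug_file TEXT, "
    "debug_id TEXT, code_file TEXT, code_id TEXT)"
)
conn.executemany(
    "INSERT INTO missing_symbols VALUES (?, ?, ?, ?, ?)",
    [
        # Two sightings of the same module inside the window.
        ("2016-06-05 09:00:00", "example.pdb", "ABC123", "example.dll", "DEF456"),
        ("2016-06-05 10:00:00", "example.pdb", "ABC123", "example.dll", "DEF456"),
        # One sighting outside the 24-hour window.
        ("2016-06-03 08:00:00", "stale.pdb", "000111", "stale.dll", "222333"),
    ],
)
rows = recent_unique_rows(conn, now=datetime(2016, 6, 5, 12, 0, 0), hours=24)
```

Because the caller passes no date, such an endpoint would need no parameters, matching the suggestion above.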
Flags: needinfo?(ted)
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/7646799eec82bb8d0243740442127ed7e822b4be
bug 1140937 - it's "filename" not "code_file", r=luser
https://github.com/mozilla/socorro/commit/7fe6593b46f2a91a4f5a4cc7bcfbe58d1847af14
Merge pull request #3365 from peterbe/bug-1140937-its-filename-not-code_file bug 1140937 - it's "filename" not "code_file"
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/ce0747596914c625225c420192fee1f66c1465be fixes bug 1140937 - write code_id and code_file to missing symbols (#3366) r=luser
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED