Bug 911585
Opened 12 years ago
Closed 12 years ago
Missing crash dumps since 2013-08-28
Categories: Socorro :: Backend, task
Tracking: Not tracked
Status: VERIFIED FIXED
People: Reporter: bc, Assigned: rhelmer
Crash dump files stopped appearing on fs1 after the one for 2013-08-27; files for 2013-08-28 onward are missing.
See bug 794176 for background on previous occurrences.
ssh fs1.corpdmz.scl3.mozilla.com 'ls /data/security_group/crash_urls/'
20130816-crashdata.csv.gz
20130817-crashdata.csv.gz
20130818-crashdata.csv.gz
20130819-crashdata.csv.gz
20130820-crashdata.csv.gz
20130821-crashdata.csv.gz
20130822-crashdata.csv.gz
20130823-crashdata.csv.gz
20130824-crashdata.csv.gz
20130825-crashdata.csv.gz
20130826-crashdata.csv.gz
20130827-crashdata.csv.gz
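A gap like this can be spotted mechanically instead of by eye. The following is a minimal sketch, assuming GNU date and the `YYYYMMDD-crashdata.csv.gz` naming shown in the listing above; `missing_crashdata` is a hypothetical helper, not something that exists on fs1.

```shell
#!/usr/bin/env bash
# Sketch: report which daily crashdata files are absent from a listing.
# Assumes GNU date. missing_crashdata is a hypothetical helper name.

# Reads a directory listing on stdin and prints every expected file name
# between START and END (inclusive, YYYY-MM-DD) that is not in the listing.
missing_crashdata() {
    local day="$1" end="$2" listing name
    listing="$(cat)"
    # Compare dates as YYYYMMDD integers so the loop stops after END.
    while [ "$(date -u -d "$day" +%Y%m%d)" -le "$(date -u -d "$end" +%Y%m%d)" ]; do
        name="$(date -u -d "$day" +%Y%m%d)-crashdata.csv.gz"
        # -x matches the whole line, -F treats the name as a fixed string.
        if ! grep -qxF "$name" <<<"$listing"; then
            echo "$name"
        fi
        day="$(date -u -d "$day +1 day" +%Y-%m-%d)"
    done
}
```

It could be fed the same listing as above, for example by piping the output of the `ssh ... 'ls /data/security_group/crash_urls/'` command into `missing_crashdata 2013-08-16 2013-09-03`.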
Comment 1 (Reporter)•12 years ago
Still not getting new crash dumps. Was this related to the outage of crash-stats on 2013-08-28 due to a configuration change?
Updated•12 years ago
Assignee: server-ops-webops → nobody
Group: mozilla-corporation-confidential
Component: Server Operations: Web Operations → Backend
Product: mozilla.org → Socorro
QA Contact: nmaul
Updated•12 years ago
Assignee: nobody → rhelmer
Comment 2 (Assignee)•12 years ago
Ah ok so we took care of this in the socorro repo, but this job is actually running out of this location (not from the Socorro install):
/data/bin/cron_daily_reports.sh
Solarce, can you help us change this please? All you should need to do is change "python" to "$PYTHON" inside that script; the include file points to the right place (which is now a virtualenv).
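To illustrate why the one-word change matters, here is a minimal sketch of resolving the interpreter from an include file rather than from PATH. `resolve_python` is a hypothetical helper, and the fallback to plain `python` is an assumption made for illustration; the real script simply sources its include file and uses `$PYTHON`.

```shell
#!/usr/bin/env bash
# Sketch: pick the interpreter named by an include file instead of
# relying on whichever "python" is first on PATH.
# resolve_python is a hypothetical helper, not part of the real script.

# Prints the interpreter to use, given the path of the include file.
resolve_python() {
    local rc="$1"
    (
        # Source the include file if present; it is expected to set PYTHON.
        if [ -f "$rc" ]; then
            . "$rc"
        fi
        # Assumed fallback: plain "python" when the include file sets nothing.
        py="${PYTHON:-python}"
        # Fail with a clear message if the interpreter cannot be found.
        if ! command -v "$py" >/dev/null 2>&1; then
            echo "interpreter not found: $py" >&2
            exit 1
        fi
        echo "$py"
    )
}
```

The subshell keeps anything the include file exports from leaking into the caller, which makes the helper safe to call more than once.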
Comment 3 (Assignee)•12 years ago
Sorry, a mid-air collision ate my needinfo? request. Solarce, please see below:
(In reply to Robert Helmer [:rhelmer] from comment #2)
> Ah ok so we took care of this in the socorro repo, but this job is actually
> running out of this location (not from the Socorro install):
>
> /data/bin/cron_daily_reports.sh
>
> Solarce can you help us change this please? All you should need to do is
> change "python" to "$PYTHON" inside that script, the include file points to
> the right place (which is now a virtualenv)
Status: NEW → ASSIGNED
Flags: needinfo?(bburton)
Comment 4•12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #3)
> Sorry mid-air ate my needinfo? request, solarce please see below:
>
> (In reply to Robert Helmer [:rhelmer] from comment #2)
> > Ah ok so we took care of this in the socorro repo, but this job is actually
> > running out of this location (not from the Socorro install):
> >
> > /data/bin/cron_daily_reports.sh
> >
> > Solarce can you help us change this please? All you should need to do is
> > change "python" to "$PYTHON" inside that script, the include file points to
> > the right place (which is now a virtualenv)
Do you think we should switch the cron job to run the script from the open source repo? https://github.com/mozilla/socorro/blob/master/scripts/crons/cron_daily_reports.sh
Flags: needinfo?(bburton)
Comment 5 (Assignee)•12 years ago
(In reply to Brandon Burton [:solarce] from comment #4)
> (In reply to Robert Helmer [:rhelmer] from comment #3)
> > Sorry mid-air ate my needinfo? request, solarce please see below:
> >
> > (In reply to Robert Helmer [:rhelmer] from comment #2)
> > > Ah ok so we took care of this in the socorro repo, but this job is actually
> > > running out of this location (not from the Socorro install):
> > >
> > > /data/bin/cron_daily_reports.sh
> > >
> > > Solarce can you help us change this please? All you should need to do is
> > > change "python" to "$PYTHON" inside that script, the include file points to
> > > the right place (which is now a virtualenv)
>
> Do you think we should switch the cron job to run the script from the open
> source repo?
> https://github.com/mozilla/socorro/blob/master/scripts/crons/cron_daily_reports.sh
So this has actually been replaced *twice* and we apparently still have not switched over ;) I am not sure that the one above has been tested, at least not recently. We'd also need to configure all the locations it pushes to, which I am sure has not been done.
The latest replacement is a crontabber job. I'd rather switch to that one and not spend the config/testing effort on the one above. I'll file a bug for this.
For now though I think we should fix the old one in-place (if you don't mind and agree), and I can handle backfilling the missing days.
Comment 6•12 years ago
(In reply to Robert Helmer [:rhelmer] from comment #5)
> (In reply to Brandon Burton [:solarce] from comment #4)
> > (In reply to Robert Helmer [:rhelmer] from comment #3)
> > > Sorry mid-air ate my needinfo? request, solarce please see below:
> > >
> > > (In reply to Robert Helmer [:rhelmer] from comment #2)
> > > > Ah ok so we took care of this in the socorro repo, but this job is actually
> > > > running out of this location (not from the Socorro install):
> > > >
> > > > /data/bin/cron_daily_reports.sh
> > > >
> > > > Solarce can you help us change this please? All you should need to do is
> > > > change "python" to "$PYTHON" inside that script, the include file points to
> > > > the right place (which is now a virtualenv)
> >
> > Do you think we should switch the cron job to run the script from the open
> > source repo?
> > https://github.com/mozilla/socorro/blob/master/scripts/crons/cron_daily_reports.sh
>
> So this has actually been replaced *twice* and we apparently still have not
> switched over ;) I am not sure that the one above has been tested, not
> anytime lately at least. We'd need to configure all the locations to push to
> etc. too which I am sure is not done.
>
> The latest is a crontabber job, I'd rather switch to that one really, and
> not waste the config/testing effort on the above one. I'll file a bug for
> this.
>
> For now though I think we should fix the old one in-place (if you don't mind
> and agree), and I can handle backfilling the missing days.
Sounds good, I'll fix it in Puppet's svn tree and have it pushed live shortly.
Comment 7•12 years ago
Change committed and pushed
-> % svn diff
Index: modules/socorro/files/prod/data-bin/cron_daily_reports.sh
===================================================================
--- modules/socorro/files/prod/data-bin/cron_daily_reports.sh (revision 74408)
+++ modules/socorro/files/prod/data-bin/cron_daily_reports.sh (working copy)
@@ -14,7 +14,7 @@
fi
SCRIPT_RUN_DATE=`date -d "$REPORT_DATE" '+%Y-%m-%d'`
-python /data/socorro/application/scripts/startDailyUrl.py --day=$SCRIPT_RUN_DATE
+$PYTHON /data/socorro/application/scripts/startDailyUrl.py --day=$SCRIPT_RUN_DATE
DATA_FILE=`date -d "$REPORT_DATE" '+%Y%m%d-crashdata.csv.gz'`
scp ${HOME}/${DATA_FILE} bacula@10.22.72.131:/data/security_group/crash_urls/
bburton@althalus [01:57:45] [~/code/mozilla/sysadmins/puppet/trunk]
-> % svn ci -m "updating python binary path to use virtualenv, bug 911585"
Sending trunk/modules/socorro/files/prod/data-bin/cron_daily_reports.sh
Transmitting file data .
Committed revision 74409.
Info: Applying configuration version '74410'
Info: FileBucket adding {md5}97c878b7df5c3e29beab0a82ddfd9ab7
Info: /File[/data/bin/cron_daily_reports.sh]: Filebucketed /data/bin/cron_daily_reports.sh to main with sum 97c878b7df5c3e29beab0a82ddfd9ab7
Notice: /File[/data/bin/cron_daily_reports.sh]/content: content changed '{md5}97c878b7df5c3e29beab0a82ddfd9ab7' to '{md5}66c2fe8f6889ba13d9414cc83c6ab4b1'
Notice: Finished catalog run in 198.27 seconds
Script ran successfully, sensitive info REDACTED
[socorro@sp-admin01.phx1 ~]$ bash -x /data/bin/cron_daily_reports.sh
+ [[ -f /tmp/daily_urls.lock ]]
+ touch /tmp/daily_urls.lock
+ . /etc/socorro/socorrorc
++ SOCORRO_DIR=/data/socorro
++ APPDIR=/data/socorro/application
++ PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/socorro/bin:/home/socorro/bin:/home/socorro/python_extras/bin
++ PYTHONPATH=/home/socorro/python_extras/lib:/data/socorro/application
++ PYTHON=/data/socorro/socorro-virtualenv/bin/python
++ SOCORRO_CONFIG=/etc/socorro/common.conf
++ export APPDIR PATH PYTHONPATH SOCORRO_CONFIG
++ '[' -f /etc/socorro/common.conf ']'
++ . /etc/socorro/common.conf
+++ export databaseHost=REDACTED
+++ databaseHost=REDACTED
+++ export databaseName=REDACTED
+++ databaseName=REDACTED
+++ export databaseUserName=REDACTED
+++ databaseUserName=REDACTED
+++ export databasePassword=REDACTED
+++ databasePassword=REDACTED
+++ export hbaseHost=REDACTED
+++ hbaseHost=REDACTED
+++ export hbaseTimeout=5000
+++ hbaseTimeout=5000
+++ export stderrErrorLoggingLevel=10
+++ stderrErrorLoggingLevel=10
+++ export syslogErrorLoggingLevel=10
+++ syslogErrorLoggingLevel=10
+++ export persistentDataPathname=/home/socorro/persistent/bugzilla.pickle
+++ persistentDataPathname=/home/socorro/persistent/bugzilla.pickle
+++ export localFS=/home/socorro/primaryCrashStore
+++ localFS=/home/socorro/primaryCrashStore
+++ export fallbackFS=/home/socorro/primaryCrashStore
+++ fallbackFS=/home/socorro/primaryCrashStore
+++ export hbaseFallbackFS=/home/socorro/primaryCrashStore
+++ hbaseFallbackFS=/home/socorro/primaryCrashStore
+++ export temporaryFileSystemStoragePath=/home/socorro/temp
+++ temporaryFileSystemStoragePath=/home/socorro/temp
+++ export minidump_stackwalkPathname=/data/socorro/stackwalk/bin/stackwalk.sh
+++ minidump_stackwalkPathname=/data/socorro/stackwalk/bin/stackwalk.sh
+++ export syslogFacilityString=local2
+++ syslogFacilityString=local2
+++ export processorSymbolsPathnameList=/mnt/socorro/symbols/symbols_ffx,/mnt/socorro/symbols/symbols_sea,/mnt/socorro/symbols/symbols_tbrd,/mnt/socorro/symbols/symbols_mob,/mnt/socorro/symbols/symbols_penelope,/mnt/socorro/symbols/symbols_sbrd,/mnt/socorro/symbols/symbols_camino,/mnt/socorro/symbols/symbols_os,/mnt/socorro/symbols/symbols_solaris,/mnt/socorro/symbols/symbols_opensuse,/mnt/socorro/symbols/symbols_ubuntu,/mnt/socorro/symbols/symbols_fedora,/mnt/socorro/symbols/symbols_adobe,/mnt/socorro/symbols/symbols_b2g
+++ processorSymbolsPathnameList=/mnt/socorro/symbols/symbols_ffx,/mnt/socorro/symbols/symbols_sea,/mnt/socorro/symbols/symbols_tbrd,/mnt/socorro/symbols/symbols_mob,/mnt/socorro/symbols/symbols_penelope,/mnt/socorro/symbols/symbols_sbrd,/mnt/socorro/symbols/symbols_camino,/mnt/socorro/symbols/symbols_os,/mnt/socorro/symbols/symbols_solaris,/mnt/socorro/symbols/symbols_opensuse,/mnt/socorro/symbols/symbols_ubuntu,/mnt/socorro/symbols/symbols_fedora,/mnt/socorro/symbols/symbols_adobe,/mnt/socorro/symbols/symbols_b2g
+++ export primaryStorageClass=socorro.storage.crashstorage.CrashStorageSystemForLocalFS
+++ primaryStorageClass=socorro.storage.crashstorage.CrashStorageSystemForLocalFS
+++ export wsgiInstallation=True
+++ wsgiInstallation=True
+++ export modwsgiInstallation=True
+++ modwsgiInstallation=True
+++ export WSGIPythonPath=/data/socorro/application:/data/socorro/thirdparty:/data/socorro/application/scripts:/usr/lib64/python2.6/site-packages
+++ WSGIPythonPath=/data/socorro/application:/data/socorro/thirdparty:/data/socorro/application/scripts:/usr/lib64/python2.6/site-packages
+++ export smtpHostname=smtp.socketlabs.com
+++ smtpHostname=smtp.socketlabs.com
+++ export smtpPort=25
+++ smtpPort=25
+++ export smtpUsername=REDACTED
+++ smtpUsername=REDACTED
+++ export smtpPassword=REDACTED
+++ smtpPassword=REDACTED
+++ export fromEmailAddress=no-reply@crash-stats.mozilla.com
+++ fromEmailAddress=no-reply@crash-stats.mozilla.com
+++ export unsubscribeBaseUrl=http://crash-stats.mozilla.com/email/subscription/%s
+++ unsubscribeBaseUrl=http://crash-stats.mozilla.com/email/subscription/%s
+++ export product_uris=firefox/nightly/latest-mozilla-1.9.1/,firefox/nightly/latest-mozilla-1.9.2/,firefox/nightly/latest-mozilla-central/,seamonkey/nightly/latest-comm-1.9.1/,seamonkey/nightly/latest-comm-central-trunk/,thunderbird/nightly/latest-comm-1.9.2/,thunderbird/nightly/latest-comm-central/,mobile/nightly/latest-mobile-1.9.2/,mobile/nightly/latest-mobile-trunk/,camino/nightly/latest-2.1-M1.9.2/
+++ product_uris=firefox/nightly/latest-mozilla-1.9.1/,firefox/nightly/latest-mozilla-1.9.2/,firefox/nightly/latest-mozilla-central/,seamonkey/nightly/latest-comm-1.9.1/,seamonkey/nightly/latest-comm-central-trunk/,thunderbird/nightly/latest-comm-1.9.2/,thunderbird/nightly/latest-comm-central/,mobile/nightly/latest-mobile-1.9.2/,mobile/nightly/latest-mobile-trunk/,camino/nightly/latest-2.1-M1.9.2/
+++ export persistentBrokenDumpPathname=/home/socorro/persistent/fixbrokendumps.pickle
+++ persistentBrokenDumpPathname=/home/socorro/persistent/fixbrokendumps.pickle
+++ export brokenFennecFixer=/data/bin/minidump_hack-fennec
+++ brokenFennecFixer=/data/bin/minidump_hack-fennec
+++ export brokenFirefoxLinuxFixer=/data/bin/minidump_hack-firefox_linux
+++ brokenFirefoxLinuxFixer=/data/bin/minidump_hack-firefox_linux
+++ export elasticSearchHostname=
+++ elasticSearchHostname=
+++ export searchImplClass=socorro.search.postgresql.PostgresAPI
+++ searchImplClass=socorro.search.postgresql.PostgresAPI
+++ export product=Firefox,Fennec,FennecAndroid
+++ product=Firefox,Fennec,FennecAndroid
+++ export statsdHost=graphite1.dmz.phx1.mozilla.com
+++ statsdHost=graphite1.dmz.phx1.mozilla.com
+++ export statsdPrefix=socorro-prod
+++ statsdPrefix=socorro-prod
+++ export brokenBoot2GeckoFixer=/data/bin/minidump_hack-b2g
+++ brokenBoot2GeckoFixer=/data/bin/minidump_hack-b2g
+ REPORT_DATE='1 days ago'
+ '[' -n '' ']'
++ date -d '1 days ago' +%Y-%m-%d
+ SCRIPT_RUN_DATE=2013-09-03
+ /data/socorro/socorro-virtualenv/bin/python /data/socorro/application/scripts/startDailyUrl.py --day=2013-09-03
2013-09-04 14:23:16,186 INFO - current configuration:
2013-09-04 14:23:16,187 INFO - databaseHost=REDACTED
2013-09-04 14:23:16,187 INFO - databaseName=REDACTED
2013-09-04 14:23:16,187 INFO - databasePassword=REDACTED
2013-09-04 14:23:16,188 INFO - databaseUserName=REDACTED
2013-09-04 14:23:16,188 INFO - day=2013-09-03 00:00:00+00:00
2013-09-04 14:23:16,188 INFO - outputPath=.
2013-09-04 14:23:16,188 INFO - product=Firefox,Fennec,FennecAndroid
2013-09-04 14:23:16,189 INFO - publicOutputPath=.
2013-09-04 14:23:16,189 INFO - stderrErrorLoggingLevel=10
2013-09-04 14:23:16,189 INFO - stderrLineFormatString=%(asctime)s %(levelname)s - %(message)s
2013-09-04 14:23:16,189 INFO - syslogErrorLoggingLevel=10
2013-09-04 14:23:16,190 INFO - syslogFacilityString=local2
2013-09-04 14:23:16,190 INFO - syslogHost=localhost
2013-09-04 14:23:16,190 INFO - syslogLineFormatString=Socorro Daily URL (pid %(process)d): %(asctime)s %(levelname)s - %(threadName)s - %(message)s
2013-09-04 14:23:16,190 INFO - syslogPort=514
2013-09-04 14:23:16,191 INFO - utc_now=<function utc_now at 0x7f952461c398>
2013-09-04 14:23:16,191 INFO - version=
2013-09-04 14:23:16,191 INFO -
2013-09-04 14:23:16,198 DEBUG - config.day = 2013-09-03 00:00:00+00:00; now = 2013-09-04 00:00:00+00:00; yesterday = 2013-09-03 00:00:00+00:00
2013-09-04 14:23:16,198 DEBUG - config.day = 2013-09-03 00:00:00+00:00; now = 2013-09-04; yesterday = 2013-09-03
2013-09-04 14:23:16,198 DEBUG - SQL is:
select
r.signature, -- 0
r.url, -- 1
'http://crash-stats.mozilla.com/report/index/' || r.uuid as uuid_url, -- 2
to_char(r.client_crash_date,'YYYYMMDDHH24MI') as client_crash_date, -- 3
to_char(r.date_processed,'YYYYMMDDHH24MI') as date_processed, -- 4
r.last_crash, -- 5
r.product, -- 6
r.version, -- 7
r.build, -- 8
'' as branch, -- 9
r.os_name, --10
r.os_version, --11
r.cpu_name || ' | ' || r.cpu_info as cpu_info, --12
r.address, --13
array(select ba.bug_id from bug_associations ba where ba.signature = r.signature) as bug_list, --14
r.user_comments, --15
r.uptime as uptime_seconds, --16
case when (r.email is NULL OR r.email='') then '' else r.email end as email, --17
(select sum(adu_count) from raw_adu adu
where adu.date = '2013-09-04'
and r.product = adu.product_name and r.version = adu.product_version
and substring(r.os_name from 1 for 3) = substring(adu.product_os_platform from 1 for 3)
and r.os_version LIKE '%'||adu.product_os_version||'%') as adu_count, --18
r.topmost_filenames, --19
case when (r.addons_checked is NULL) then '[unknown]'when (r.addons_checked) then 'checked' else 'not' end as addons_checked, --20
r.flash_version, --21
r.hangid, --22
r.reason, --23
r.process_type, --24
r.app_notes, --25
r.install_age, --26
rd.duplicate_of, --27
r.release_channel, --28
r.productid --29
from
reports r left join reports_duplicates rd on r.uuid = rd.uuid
where
'2013-09-03' <= r.date_processed and r.date_processed < '2013-09-04'
and r.product in ('Firefox','Fennec','FennecAndroid')
order by 5 -- r.date_processed, munged
2013-09-04 14:24:58,301 DEBUG - MainThread - killing database connections
2013-09-04 14:24:58,301 DEBUG - MainThread - connection MainThread closed
2013-09-04 14:24:58,352 INFO - done.
++ date -d '1 days ago' +%Y%m%d-crashdata.csv.gz
+ DATA_FILE=20130903-crashdata.csv.gz
+ scp /home/socorro/20130903-crashdata.csv.gz bacula@REDACTED:/data/security_group/crash_urls/
20130903-crashdata.csv.gz 100% 51MB 25.3MB/s 00:02
+ scp /home/socorro/20130903-crashdata.csv.gz mozauto@sisyphus.bughunter.ateam.phx1.mozilla.com:/work/mozilla/crash-reports/
20130903-crashdata.csv.gz 100% 51MB 50.6MB/s 00:01
+ ssh bacula@10.22.72.131 'chmod 640 /data/security_group/crash_urls/*'
+ ssh mozauto@sisyphus.bughunter.ateam.phx1.mozilla.com 'chmod 640 /work/mozilla/crash-reports/*'
+ mv /home/socorro/20130903-crashdata.csv.gz /tmp
++ date -d '1 days ago' +%Y%m%d-pub-crashdata.csv.gz
+ DATA_FILE=20130903-pub-crashdata.csv.gz
++ date -d '1 days ago' +%Y%m%d
+ SCRIPT_RUN_DATE=20130903
+ mkdir /mnt/crashanalysis/crash_analysis/20130903/
mkdir: cannot create directory `/mnt/crashanalysis/crash_analysis/20130903/': File exists
+ cp 20130903-pub-crashdata.csv.gz /mnt/crashanalysis/crash_analysis/20130903/
+ mv /home/socorro/20130903-pub-crashdata.csv.gz /tmp
+ rm -f /tmp/daily_urls.lock
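The trace shows the script guarding itself with /tmp/daily_urls.lock: it checks for the file, touches it, and removes it as the last step. If a run is interrupted partway, that final `rm` never happens and later runs would be blocked. A minimal sketch of the same pattern with guaranteed cleanup follows; `run_with_lock` is a hypothetical helper and not how the real script is written.

```shell
#!/usr/bin/env bash
# Sketch: run a command under a lock file that is removed even when the
# command fails. run_with_lock is a hypothetical helper name.

run_with_lock() {
    local lock="$1"
    shift
    # Refuse to start if a previous run left (or still holds) the lock.
    if [ -f "$lock" ]; then
        echo "lock $lock exists, another run may be active" >&2
        return 1
    fi
    touch "$lock"
    (
        # The EXIT trap fires when this subshell ends for any reason,
        # so the lock is removed on success and on failure alike.
        trap 'rm -f "$lock"' EXIT
        "$@"
        exit "$?"
    )
}
```

As in the original script, the check-then-touch sequence is not atomic, so this only protects against overlapping cron runs that start well apart. Separately, `mkdir -p` would avoid the harmless "File exists" message seen in the trace when a day is rerun.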
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 8•12 years ago
I am going to kick off some backfill for this
Comment 9•12 years ago
All the backfill jobs have finished. I ran 2013-08-24 through 2013-09-03; I had already done 2013-09-04 earlier when testing.
Please let us know if anything is missing
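For reference, a backfill over a date range can be sketched as a loop around the same `startDailyUrl.py --day=` invocation that appears in the trace above. This is a sketch under assumptions, not the commands that were actually run: it assumes GNU date, `backfill_commands` is a hypothetical helper, and it only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: print one report command per day in an inclusive date range.
# Assumes GNU date. backfill_commands is a hypothetical helper; it echoes
# the commands instead of running them.

backfill_commands() {
    local day="$1" end="$2"
    # Assumed fallback to plain "python" if PYTHON is not set.
    local python="${PYTHON:-python}"
    # Script path as it appears in the cron script's trace.
    local script="/data/socorro/application/scripts/startDailyUrl.py"
    # Compare dates as YYYYMMDD integers so the loop stops after END.
    while [ "$(date -u -d "$day" +%Y%m%d)" -le "$(date -u -d "$end" +%Y%m%d)" ]; do
        echo "$python $script --day=$day"
        day="$(date -u -d "$day +1 day" +%Y-%m-%d)"
    done
}
```

Calling `backfill_commands 2013-08-24 2013-09-03` would list one command per day in that range. It covers only the CSV generation step; copying the resulting files to their destinations, as the cron script does with scp, is left out.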
Comment 10 (Reporter)•12 years ago
looks good. I see the files on fs1 and sisyphus. thanks!
Status: RESOLVED → VERIFIED