Closed
Bug 1122260
Opened 11 years ago
Closed 10 years ago
Add processor rule to make MD5 sum of dumps
Categories
(Socorro :: Backend, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: lars, Assigned: lars)
Details
add and deploy a new processor rule to make an md5sum for the dumps and save that info in the raw_crash.
Comment 1•11 years ago
Here's the complication: the processors don't actually have the dumps in memory. Since a dump's primary use is as food for the stackwalker, the dumps get pushed to disk immediately on loading. They're gone from memory before the crash processing algorithm even starts.
There are several ways forward:
1) go back to thinking about putting the code into the collector - with the caveat that it will slow down the collector.
2) make a processing rule that invokes the command line md5sum utility on the dumps on disk
3) make a processing rule that reads the dumps from disk and calculates the MD5 in memory
4) rework the crashstorage API such that fetching the dumps automatically calculates the md5 as it copies the dumps from external storage into the filesystem for the stackwalker. This method has the advantage, for S3, of being able to fetch the hash directly from S3 for that storage scheme. The other storage schemes would fall back to calculating it while the dump is in memory for that brief moment during the write to disk.
I favor method 4: for us, it would cost no overhead when using S3. Our other storage methods would incur a minor cost.
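Method 4 amounts to hashing the dump as it streams from external storage to disk, so no extra read pass is needed. A minimal sketch of that idea, with illustrative names (`save_dump_with_md5` and its parameters are not the actual Socorro crashstorage API):

```python
import hashlib


def save_dump_with_md5(dump_stream, dest_path, chunk_size=64 * 1024):
    """Copy a dump stream to dest_path, computing its MD5 along the way.

    `dump_stream` is any file-like object yielding the dump's bytes.
    Returns the hex digest of the dump's contents.
    """
    digest = hashlib.md5()
    with open(dest_path, "wb") as dest:
        while True:
            chunk = dump_stream.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)  # hash the chunk...
            dest.write(chunk)     # ...while writing it for the stackwalker
    return digest.hexdigest()
```

An S3-backed store could skip the hashing entirely and return the checksum S3 already holds for the object; the streaming version above is only the fallback path.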
Comment 2•11 years ago
(In reply to K Lars Lohn [:lars] [:klohn] from comment #1)
> 4) rework the crashstorage API such that fetching the dumps automatically
> calculates the md5 as it copies the dumps from external storage into the
> filesystem for stackwalker. This method has the advantage for S3 of being
> able to fetch the hash from S3 for that storage scheme. The other storage
> schemes would fall back to calculating it while it is in memory for that
> brief moment during the write to disk.
>
> I favor method 4, as for us, it would cost no overhead when using S3. Our
> other storage methods would experience a minor cost.
#4 sounds good to me.
Comment 3•10 years ago
The collector now creates a hash of the dumps: there is a new key in the raw_crash called "dump_checksums", which contains a mapping of dump name to checksum.
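The "dump_checksums" structure described above is just a name-to-digest dict. A sketch of how a collector might build it (hypothetical helper, not the shipped collector code):

```python
import hashlib


def checksum_dumps(dumps):
    """Build a "dump_checksums" mapping: dump name -> MD5 hex digest.

    `dumps` maps dump names to their raw bytes, as received at
    collection time.
    """
    return {
        name: hashlib.md5(payload).hexdigest()
        for name, payload in dumps.items()
    }


# The collector would then store the result in the raw_crash, e.g.:
# raw_crash["dump_checksums"] = checksum_dumps(received_dumps)
```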
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED