Closed
Bug 661266
Opened 13 years ago
Closed 10 years ago
Socorro - care and maintenance of 'osdims' and 'productdims' tables
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: lars, Unassigned)
Currently, the 'osdims' and 'productdims' tables are populated by clients of the socorro.database.cachedIdAccess module's IdCache class. That means there is no central process responsible for populating these tables. The crons TCBS, TCBU and DailyUrl all use the IdCache class, so any of them may be inserting into the 'osdims' and 'productdims' tables. I suggest that we consolidate this into one location so that we have a saner single-writer, many-reader design. That one location ought to be the processor. It examines every processed crash, so it has the opportunity to keep the 'osdims' and 'productdims' tables up to date with the stream of incoming crashes. I also suggest that the reports table be given extra fields for 'productdims_id' and 'osdims_id'. This would simplify the new TCBS and TCBU crons, as they would no longer be responsible for normalizing OS versions. I'm considering having the processor also maintain the 'urldims' table, because parallelism is attractive. However, that task depends on the fate of TCBU.
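To make the single-writer, many-reader proposal concrete, here is a minimal sketch. It uses an in-memory SQLite database rather than Socorro's real database, and the table shape and all function names are assumptions made for illustration; none of this is the actual IdCache API.

```python
import sqlite3


def create_osdims(conn):
    # Assumed minimal shape of 'osdims'; the real Socorro schema may differ.
    conn.execute(
        "CREATE TABLE osdims ("
        " id INTEGER PRIMARY KEY,"
        " os_name TEXT NOT NULL,"
        " os_version TEXT NOT NULL,"
        " UNIQUE (os_name, os_version))"
    )


def writer_get_or_create_os_id(conn, os_name, os_version):
    """Hypothetical single-writer entry point: the only code allowed to insert."""
    conn.execute(
        "INSERT OR IGNORE INTO osdims (os_name, os_version) VALUES (?, ?)",
        (os_name, os_version),
    )
    row = conn.execute(
        "SELECT id FROM osdims WHERE os_name = ? AND os_version = ?",
        (os_name, os_version),
    ).fetchone()
    return row[0]


def reader_lookup_os_id(conn, os_name, os_version):
    """Hypothetical reader entry point (e.g. for the crons): lookup only, no insert."""
    row = conn.execute(
        "SELECT id FROM osdims WHERE os_name = ? AND os_version = ?",
        (os_name, os_version),
    ).fetchone()
    return row[0] if row else None
```

With this split, a cron that meets an OS it cannot resolve gets None back instead of silently creating a new dimension row.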
Reporter
Comment 1•13 years ago
In the same theme as this bug, the 'osdims' table should be reduced to only a few entries. If the incoming OS doesn't match anything seen before, it shouldn't automatically be included in the table; there should be an 'other' or 'unknown' entry in the osdims table for these. What about new versions of a known OS? How do we distinguish those from spurious garbage?
Comment 2•13 years ago
Can we have known patterns that are considered valid, like "Windows NT x.x" and "Mac OS X x.x", insert new entries for new matches of those, and add the mapping to human-readable versions (e.g. "Windows 7") later?
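A rough sketch of that idea, combined with the 'unknown' bucket from comment 1. The two pattern shapes come from the comment above; the exact regexes, the names KNOWN_OS_PATTERNS and normalize_os, and the choice to accept exactly two numeric components are all assumptions.

```python
import re

# Hypothetical whitelist; real version strings may carry more components
# (build numbers, service packs) and would need looser patterns.
KNOWN_OS_PATTERNS = [
    re.compile(r"^Windows NT \d+\.\d+$"),
    re.compile(r"^Mac OS X \d+\.\d+$"),
]

UNKNOWN_OS = "unknown"


def normalize_os(raw_os):
    """Return the cleaned OS string if it matches a known pattern, else 'unknown'."""
    if raw_os is None:
        return UNKNOWN_OS
    candidate = raw_os.strip()
    for pattern in KNOWN_OS_PATTERNS:
        if pattern.match(candidate):
            return candidate
    return UNKNOWN_OS
```

A new version of a known OS passes the pattern check and can be inserted as a new entry, while spurious strings collapse into the single 'unknown' entry.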
Comment 3•13 years ago
After some discussion on IRC:
(1) Productdims will become completely different under the new releasechannel model, so it's taken off this bug.
(2) I disagree that the processors ought to handle this, for several reasons:
(a) processors do row-at-a-time instead of batches
(b) will add locking overhead to processors
(c) will add extra processing time to processors
(3) Adding IDs to the reports table ought to be part of a more general move to separate "raw" data from a fully normalized & cleaned fact table. In other words, we shouldn't be adding columns, we should be creating a new table.
(4) I've added a new bug for a new osdims schema: 674065
Depends on: 674065
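For contrast with row-at-a-time inserts in the processor, the batch approach argued for in point (2) might look like the sketch below. It is illustrative only: it runs against in-memory SQLite, and the function names and the os_name/os_version columns on both tables are assumptions, not the schema from bug 674065.

```python
import sqlite3


def create_tables(conn):
    # Assumed minimal shapes; the real 'reports' and 'osdims' tables differ.
    conn.execute(
        "CREATE TABLE reports (id INTEGER PRIMARY KEY,"
        " os_name TEXT NOT NULL, os_version TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE osdims (id INTEGER PRIMARY KEY,"
        " os_name TEXT NOT NULL, os_version TEXT NOT NULL,"
        " UNIQUE (os_name, os_version))"
    )


def count_osdims(conn):
    return conn.execute("SELECT COUNT(*) FROM osdims").fetchone()[0]


def refresh_osdims(conn):
    """Insert all OS pairs seen in reports but missing from osdims, in one statement.

    Returns the number of rows added by this run.
    """
    before = count_osdims(conn)
    conn.execute(
        "INSERT INTO osdims (os_name, os_version) "
        "SELECT DISTINCT r.os_name, r.os_version FROM reports r "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM osdims o"
        "  WHERE o.os_name = r.os_name AND o.os_version = r.os_version)"
    )
    conn.commit()
    return count_osdims(conn) - before
```

A periodic job calling refresh_osdims does one set-based insert per run, so the processors never need to write to or take locks on the dimension table.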
Assignee
Updated•13 years ago
Component: Socorro → General
Product: Webtools → Socorro
Reporter
Comment 4•10 years ago
Long ago resolved.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED