Closed Bug 674415 Opened 13 years ago Closed 12 years ago

Replace reports with a real fact table

Categories

(Socorro :: General, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jberkus, Assigned: jberkus)

Details

(Whiteboard: [qa-])

This bug makes the assumption that we will never be able to entirely replace the reports table in PostgreSQL with ElasticSearch.  It is more-or-less a contingency plan.

Currently the reports table (as well as several other tables) in the PostgreSQL database conflates the concept of a fact table with that of a raw data table.  reports is a raw data table: full of raw strings, garbage, duplicates, and bad rows, as well as many fields which are populated for less than 2% of the table.  A fact table would contain cleaned data, and many, if not most, string values would be reduced to references to canonical dimension tables.

Currently, the reports table has this schema:

 id                  | integer                     | not null default nextval('reports_id_seq'::regclass)
 client_crash_date   | timestamp with time zone    | 
 date_processed      | timestamp without time zone | 
 uuid                | character varying(50)       | not null
 product             | character varying(30)       | 
 version             | character varying(16)       | 
 build               | character varying(30)       | 
 signature           | character varying(255)      | 
 url                 | character varying(255)      | 
 install_age         | integer                     | 
 last_crash          | integer                     | 
 uptime              | integer                     | 
 cpu_name            | character varying(100)      | 
 cpu_info            | character varying(100)      | 
 reason              | character varying(255)      | 
 address             | character varying(20)       | 
 os_name             | character varying(100)      | 
 os_version          | character varying(100)      | 
 email               | character varying(100)      | 
 build_date          | timestamp without time zone | 
 user_id             | character varying(50)       | 
 started_datetime    | timestamp without time zone | 
 completed_datetime  | timestamp without time zone | 
 success             | boolean                     | 
 truncated           | boolean                     | 
 processor_notes     | text                        | 
 user_comments       | character varying(1024)     | 
 app_notes           | character varying(1024)     | 
 distributor         | character varying(20)       | 
 distributor_version | character varying(20)       | 
 topmost_filenames   | text                        | 
 addons_checked      | boolean                     | 
 flash_version       | text                        | 
 hangid              | text                        | 
 process_type        | text                        | 

... the reason nearly every column is VARCHAR or TEXT is so that it can accept garbage values.  

In a real DW configuration, only a "buffer" table would have a schema like that, and it would only hold an hour of data or so.  Periodically, a set of ETL procedures (in python or in stored procedures, it doesn't much matter) would clean the data and put it into a normalized reports schema which would look more like:

 client_crash_date   | timestamp with time zone
 date_processed      | timestamp with time zone
 uuid                | uuid not null PK
 product             | INT FK products
 version             | INT FK product_versions
 build               | INT FK builds
 signature           | INT FK signatures
 url                 | url_id
 url_raw             | text
 install_time        | timestamp with time zone
 last_crash          | timestamp with time zone
 uptime              | interval
 cpu                 | INT FK cpus
 reason              | character varying(255)
 address             | character varying(20)
 os_version          | INT FK os_versions
 build_date          | timestamp with time zone
 processor_notes     | text
 app_notes           | text
 hang_id             | uuid
 
 (A number of fields I'm not sure we still need have been omitted; those will need to be discussed.)
 
Some things which are seldom inserted and have special data purposes should be farmed off to related tables, for example:

reports_userinfo
    uuid            uuid FK reports.uuid
    date_processed  timestamp with time zone
    email           email
    user_comments   text
	
The above is just an example and would need some discussion and analysis.  But it would make the reports table MUCH smaller (on the order of 75%) and the data more consistent, and thus faster to query and more useful.
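As a rough illustration of the ETL step described above (hypothetical table and column names; sketched with SQLite for brevity, though Socorro itself runs on PostgreSQL), a batch job might look up or create dimension rows and write only integer keys into the normalized table:

```python
import sqlite3

# In-memory stand-in for the buffer table and the normalized schema.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE reports_buffer (uuid TEXT, signature TEXT, os_name TEXT);
    CREATE TABLE signatures (id INTEGER PRIMARY KEY, signature TEXT UNIQUE);
    CREATE TABLE reports_normalized (uuid TEXT PRIMARY KEY, signature_id INT);
""")

def dimension_id(db, table, column, value):
    """Get-or-create: return the surrogate key for a raw string value."""
    db.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = db.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,)).fetchone()
    return row[0]

def etl_batch(db):
    """Move one batch from the buffer into the normalized fact table."""
    rows = db.execute("SELECT uuid, signature FROM reports_buffer").fetchall()
    for uuid, signature in rows:
        sig_id = dimension_id(db, "signatures", "signature", signature)
        db.execute("INSERT OR REPLACE INTO reports_normalized VALUES (?, ?)",
                   (uuid, sig_id))
    db.execute("DELETE FROM reports_buffer")
    db.commit()

db.executemany(
    "INSERT INTO reports_buffer VALUES (?, ?, ?)",
    [("a1", "js::GC", "Windows"),
     ("a2", "js::GC", "Linux"),
     ("a3", "nsFoo::Bar", "Linux")],
)
etl_batch(db)
# Two distinct signature strings collapse into two dimension rows;
# the fact table stores only integer references.
```

Repeated strings (signatures, OS names, product names) are stored once in the dimension table, which is where the bulk of the projected size reduction comes from.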

Unfortunately, I can't see any good way to move towards this model incrementally in any reasonable period of time.  We would need to overhaul a lot of the application to use the new data structure.
I certainly like this idea.  I would further like to collapse the product/version/build/channel into a single column that refers to a dimension table - the same one that we'll be using for the new materialized views.
Lars,

Yeah, I was thinking of that.  That partly depends on performance tests; we do a LOT of rollups by OS, version, and build, so it might actually be better to have all three references in the table even though they're redundant.

And actually, it occurs to me that there's no reason we can't build the new_reports table alongside the old and gradually move functionality over to it.  I'd need to spend a big chunk of time with you understanding the various columns ... and the sub-tables of reports ... but then we could have a phased approach.
Per IRC discussion, I'm going to suggest bumping up the schedule for this feature.  In retrospect, doing 2.2 would have been both easier and faster for all staff if I'd done "reports-normalized" in the first place instead of trying to work around not doing it.
Lars and I chatted about this.  There are some issues which make doing this more complex:

Advanced Search currently offers the ability to get reports up to the current minute.  This makes updating reports_normalized by batch impractical, unless we introduce a time-delay which users are liable to be unhappy with.  

This means that the normalization needs to happen in the processors, or via a batch job which runs every minute.  If we do it in the processors, then we need to write middleware or SP code which will deal with conflicting concurrent updates to dimension tables without blocking the processors.  Such concurrency-sensitive code will be more complex than the originally anticipated batch jobs.
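One common pattern for the concurrency-sensitive lookup described above (a sketch only, with hypothetical names; real processor code would target PostgreSQL rather than the SQLite used here) is to attempt the insert, tolerate the unique-constraint error raised when a concurrent processor wins the race, and then re-select:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE os_versions (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

def get_or_create_os_version(db, name):
    """Return the dimension key for `name`, creating the row if needed.

    If another processor inserts the same value between our INSERT and
    SELECT, the unique constraint rejects our INSERT, and the SELECT
    still finds the winner's row -- so no processor ever blocks waiting
    for a lock on the dimension table.
    """
    try:
        db.execute("INSERT INTO os_versions (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        pass  # a concurrent processor won the race; fall through and read its row
    row = db.execute("SELECT id FROM os_versions WHERE name = ?", (name,)).fetchone()
    return row[0]

first = get_or_create_os_version(db, "Windows NT 6.1")
second = get_or_create_os_version(db, "Windows NT 6.1")  # no duplicate row
```

The same insert-then-select shape works as a PL/pgSQL stored procedure catching unique_violation, which is presumably what the "SP code" option above would look like.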

However, the list of products will always be time-lagged, because it depends on FTP scraping or other asynchronous population.  This means that we would need to add some reports as being "Unknown Build" to the normalized table, and then update them later (by batch job) to be the correct release versions.
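The "Unknown Build" reconciliation described above could be a simple batch UPDATE once the scraped build list catches up.  A minimal sketch, with hypothetical table and column names and SQLite standing in for PostgreSQL:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE builds (id INTEGER PRIMARY KEY, build_code TEXT UNIQUE);
    CREATE TABLE reports_normalized (uuid TEXT PRIMARY KEY, build_code TEXT,
                                     build_id INT);
""")

UNKNOWN_BUILD = 0  # sentinel dimension key for not-yet-known builds

# A processor files a report before the FTP scraper has seen this build.
db.execute("INSERT INTO reports_normalized VALUES ('a1', '20110815', ?)",
           (UNKNOWN_BUILD,))

# Later, the scraper populates the dimension table...
db.execute("INSERT INTO builds (build_code) VALUES ('20110815')")

# ...and a batch job resolves any outstanding placeholder rows.
db.execute("""
    UPDATE reports_normalized
       SET build_id = (SELECT id FROM builds
                        WHERE builds.build_code = reports_normalized.build_code)
     WHERE build_id = ?
       AND EXISTS (SELECT 1 FROM builds
                    WHERE builds.build_code = reports_normalized.build_code)
""", (UNKNOWN_BUILD,))
db.commit()
```

Rows whose build still isn't in the dimension table are left at the sentinel value and picked up by the next run of the batch job.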

The reasons to do this are:

A) Fix current broken search issues in the Socorro UI for rapid release betas, and anticipated issues with nightlies and aurora.

B) Better eventual integration with ES-based search if we drop PG search, by allowing us to keep only metadata for reports.

C) Eventually shrinking of the database (by as much as 50%) once we can get rid of the raw reports.

D) Better performance on search, exports, and batch jobs.
This is in progress with the new reports_clean in 2.3.2.

Unresolved is deciding how to handle making reports_clean up-to-the-minute for the devs.  In 2.3.2, reports_clean gets updated hourly and runs a few hours behind.  This means that it's not current enough.
Component: Socorro → General
Product: Webtools → Socorro
This is on track but slower than expected.
Target Milestone: 2.5 → 2.7
Currently waiting on the new search code.
Target Milestone: 5 → 8
Also needs to wait on ripping out OldTCBS.
Target Milestone: 8 → 11
Target Milestone: 11 → Future
Target Milestone: Future → 18
Mobeta is merged now; marking this done.  [qa-]
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [qa-]