Closed Bug 415502 Opened 17 years ago Closed 6 years ago

Replace text .dat files with sqlite db

Categories

(Webtools Graveyard :: Tinderbox, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: cls, Assigned: cls)

References

Details

Attachments

(1 file, 1 obsolete file)

On several occassions, we've discussed replacing the text db files that contain various tinderbox state info with a sqlite db. The size savings of using a relational db and the speed savings of not needing to traverse an entire dataset line by line should be immediately obvious.
Attached patch v0.3 (obsolete) — Splinter Review
My plan is to modify checksetup.pl to convert the existing .dat files to sqlite .dbs. This way we can do the conversion once and not have to worry about making the code understand both formats. We will continue to use individual databases instead of combining them. The reasoning being that, according to their documentation, sqlite locks the entire database to perform certain write operations and I don't want to worry about blocking certain operations unnecessarily because a deluge of backlogged emails came in. The schema looks like: * build.db status (id INTEGER PRIMARY KEY AUTOINCREMENT, status TEXT NOT NULL); parsers (id INTEGER PRIMARY KEY AUTOINCREMENT, parser TEXT NOT NULL); names (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL, parserId INTEGER NOT NULL); builds (nameId INTEGER NOT NULL, statusId INTEGER NOT NULL, endtime INTEGER NOT NULL, starttime INTEGER NOT NULL, logfile TEXT, binaryurl TEXT); * who.db users (id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT NOT NULL) commits (userId INTEGER NOT NULL, timestamp INTEGER NOT NULL) * scrape.db logs (id INTEGER PRIMARY KEY AUTOINCREMENT, logfile TEXT NOT NULL, timestamp INTEGER NOT NULL) info (logId INTEGER NOT NULL, data TEXT NOT NULL); * warnings.db warnings (id INTEGER PRIMARY KEY AUTOINCREMENT, logfile TEXT NOT NULL, count INTEGER NOT NULL, timestamp INTEGER NOT NULL); The attached patch handles the conversions for the last 3 dbs. I haven't tackled the build.db yet. I'm not sure how sqlite stores its data, but the .dbs so far are larger than their .dat equivalents. -rw-rw-r-- 1 tinderbox tinderbox 1808303 Feb 6 12:53 trees/cls/scrape.dat -rw-r--r-- 1 root root 2701312 Feb 4 19:17 trees/cls/scrape.db -rw-rw-r-- 1 tinderbox tinderbox 12460 Feb 6 13:01 trees/cls/who.dat -rw-r--r-- 1 root root 15360 Feb 4 19:17 trees/cls/who.db For my local setup, the scrape data contains a lot of redundant entries as we print out the SVN revision which is the same for almost all builds at any one time. For us, it may be optimal to split data into a separate table but I'm not so sure about the general case. The problem is that, at some point, you need to either do refcounting or scan the entire data table to figure out which entries are no longer needed when you want to clean things up. Actually, each of the databases has this problem (except warnings). When trim_logs() is called, it'll be trivial to do remove old entries from the builds, commits & logs tables but we'll need to do a bit more to clean out obsolete entries from the users, info & names tables. Not sure how I'd want to handle that at this point.
Attached patch v0.5Splinter Review
I implemented the conversion for the build table and I finally saw the size savings I was expecting: -rw-rw-r-- 1 tinderbox tinderbox 8022450 Feb 11 17:16 trees/cls/build.dat -rw-r--r-- 1 root root 5180416 Feb 11 16:51 trees/cls/build.db For the scrape table, I went ahead and optimized the layout for our subversion setup and that gave me a 200k savings on a 2.5M file. However, it's still larger than the 1.9M source .dat . At this point, I suspect that the size issues are due to the implementation of the TEXT field in sqlite but I do not have any hard evidence. It just reminds of the old m*sql days with static field lengths. Anyway, I'm going to ignore the problem for how and try to avoid doing something silly like storing the individual name components of the log as integers in the logs table instead the whole text field.
Attachment #301736 - Attachment is obsolete: true
Severity: normal → enhancement
Product: Webtools → Webtools Graveyard
Tinderbox isn't maintained anymore. Closing.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: