Closed Bug 498820 Opened 11 years ago Closed 11 years ago
Places DB sample data set: update script
Update scri[pt that advances the day by 1 for each time stamp in the places.sqlite database - this is for automated tests.
Right now, places-stats.mozilla.com is down. I filed a bug 500561 This patch is the same as the places generator patch on bug 480340 including the date updater script.
The patch just filed really only needs some final testing when the places-stats site is back up and running.
Same patch as bug 480340 - I fixed all date handling to use microseconds, also posting to bug 480340. you can update the dates in an existing generated places.sqlite running: python2.5 builddb/increment_dates.py Do not use Python 2.6 - httplib2 connection timeout issues that need fixing by the author of httplib2
(02:11:35 PM) alice: so, more questions about the date updater thing (02:11:39 PM) ddahl: ok (02:11:47 PM) alice: it simply increases everything by one day? (02:12:07 PM) ddahl: yes (02:12:17 PM) ddahl: all dates are microseconds (02:12:27 PM) alice: so we'd have to guarantee that we'd run that every night to ensure that we'd keep things up to date (02:12:36 PM) ddahl: hmmm (02:12:37 PM) ddahl: yeah (02:12:44 PM) alice: which is a little tough on the release eng side of the world - what with machines going down and so on (02:12:51 PM) ddahl: that is what i thoguht we were doing, are we doing something more clever? (02:12:57 PM) ddahl: ok (02:13:09 PM) alice: yeah, i thought that it would be okay as well and then i considered the issue deeper (02:13:17 PM) alice: sometime things get shut down/delayed/etc (02:13:23 PM) ddahl: indeed (02:13:46 PM) alice: it would be better if the script could ensure that things were dated from today - except that everything is random and we don't want it all on the same date (02:14:29 PM) ddahl: i wonder if we can keep a bit of configuration in the database itself, so that we know the last time it was run and therefor can go forward x number of days? (02:14:47 PM) alice: that would be awesome (02:15:17 PM) alice: it would be great to be able to run the script and say that even if it hasn't been run in a week, or has already been run today the profile is ready (02:15:26 PM) ddahl: that should be an easy fix - we consult a place for its datetime and subtract from today to see how much to advance dates (02:15:51 PM) ddahl: perhaps then we can just update it monthly or whenever you get around to it:) (02:16:10 PM) ddahl: daily is a bit much! (02:16:15 PM) alice: indeed (02:16:19 PM) alice: thanks for the patience (02:16:24 PM) ddahl: no worries (02:16:28 PM) alice: should have thought it through before asking for changes (02:16:36 PM) ddahl: i'll update the bug and start tweaking it (02:16:42 PM) ddahl: not a problem (02:17:12 PM) alice: thanks ok, need to tweak this updater based on this chat...
Per phone call with Alice yesterday, there are basically 3 different "dirty profile" suites being tracked here: small, medium, large. The large suite is blocked, until the script in bug#498820 can handle this. Meanwhile, we're going ahead with small and medium suites, as they are working fine.
It is possible our large dataset is unrealistically large, hence the time it takes to update the places.sqlite file. I can also try another approach to break this up into fewer transactions.
Can confirm that places needs trimming, severe stalling was cured by deleting places.sqlite and places.sqlite-journal
(In reply to comment #7) > Can confirm that places needs trimming, severe stalling was cured by deleting > places.sqlite and places.sqlite-journal Sounds like another Places bug, or perhaps your places.sqlite is corrupt. this bug is about a meta tool for testing places.sqlite during build
Yes, updating the place_generated_max places.sqlite took longer than 5 hours and never completed (by that point I killed the update). For now I've disabled max dirty profile testing. We can either: - fix the update script so that it runs in a reasonable time frame - provide me with different settings to run the profile generation tool with to create a smaller, but still 'max' sized places.sqlite for testing with
I don't know python or the framework being used here very well, but does increment_dates wrap it's work in a transaction? You'll see massive speedups if it isn't and we start doing that.
I think it wraps each update in a transaction. I never thought this would require speed, in which case I will probably have to chunk a certain number of updates, run a transaction and iterate. How many date column update queries should I chunk together? Perhaps SQLite has a function that will roll each date in a table forward x number of microseconds? That's all I am doing here.
Well, if it's just one table, you can accomplish this in one query: UPDATE table SET date_col1 = date_col1 + x, date_col2 = date_col2 + x, etc
it is several tables, but this will simplify it by a couple of orders of magnitude. I am dumb and stuff.
Having some issues with the SQL to update the table in one query. This does not appear to work: UPDATE moz_bookmarks SET dateadded = dateadded + 327000000 WHERE dateadded < 1250815574000000 The value for dateadded becomes 327000000. I know there is no real "casting" in SQL (espicially sqlite) to deal with in this case.
Saving my work.
That seems odd: sqlite> create table foo (number INTEGER); sqlite> insert into foo values (5); sqlite> select * from foo; 5 sqlite> update foo set number = number + 5; sqlite> select * from foo; 10
This query: UPDATE moz_bookmarks SET dateadded = dateadded + 180000000 WHERE dateadded < 1250872302000000 AND dateadded != '' via sqlite3 works great via Python/pysqlite it sets the column to 180000000 instead of adding the values together. I will file a bug with the pysqlite folks
I'm leaning toward a problem with the way django does replacements in queries for this bug: from pysqlite2 import dbapi2 as Database c = Database.connect("/home/ddahl/code/python/places/places.sqlite") cursor = c.cursor() r = cursor.execute("update moz_bookmarks set dateadded = dateadded + 23") That works. Django: from django.db import connection cursor = connection.cursor() rows = cursor.execute("update moz_bookmarks set dateadded = dateadded + 23") rows = cursor.execute("select dateadded from moz_bookmarks") rows.fetchall() This works too. Looks like the django placeholders cannot deal with substitution of column names - I should have known this. I am just creating the queries in their entirety, and all is well.
Need to find a place for this to really live in the tree, will also update bug 480340
Alice: This will run about 8 queries instead of thousands:)
Testing new updater script on talos-staging. Will try out max_profile size there for a while before we roll out across the board.
can this be closed?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h". In Thunderbird 3.0b, you do that as follows: Tools | Message Filters Make sure the correct account is selected. Click "New" Conditions: Body contains places-to-b-and-h Change the action to "Delete Message". Select "Manually Run" from the dropdown at the top. Click OK. Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter. Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Attachment #395904 - Flags: review?(anodelman) → review+
Has this been landed?
no not landed. we probably should land it. I wonder who would be the best reviewer as it is almost entirely python and requires external python libraries...
You need to log in before you can comment on or make changes to this bug.