Places DB sample data set creation for testing

RESOLVED FIXED

Status

defect
10 years ago
6 years ago

People

(Reporter: ddahl, Assigned: ddahl)

Tracking

({dev-doc-needed})

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Attachments

(4 attachments, 19 obsolete attachments)

10.11 KB, patch
65.51 KB, patch
anodelman
: review+
47.23 KB, application/octet-stream
anodelman
: review+
45.50 KB, patch
bhearsum
: review+
ddahl
: feedback+
(Assignee)

Description

10 years ago
Create python scripts to generate Places DBs with various characteristics such as "many visits within the same domain", "visits across many domains", "many tags", "many bookmarks", etc. 
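As a rough illustration of the "many visits within the same domain" case, here is a minimal sketch using Python's built-in sqlite3 module. It is not the attached generator (which, per a later comment, goes through django's ORM). The two-table schema is a simplified assumption that borrows the real moz_places and moz_historyvisits table names, and `build_same_domain_db`, the domain, and the start timestamp are all hypothetical.

```python
import sqlite3

# Simplified, hypothetical schema: the real Places tables have more columns.
SCHEMA = """
CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, visit_count INTEGER);
CREATE TABLE moz_historyvisits (id INTEGER PRIMARY KEY, place_id INTEGER, visit_date INTEGER);
"""

USEC_PER_MINUTE = 60 * 1_000_000


def build_same_domain_db(path, pages=100, visits_per_page=5,
                         domain="example.com",
                         start_usec=1_240_000_000_000_000):
    """Create a DB in which every visit lands on a single domain."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    with conn:  # run all inserts in one transaction
        for page in range(pages):
            cur = conn.execute(
                "INSERT INTO moz_places (url, visit_count) VALUES (?, ?)",
                (f"http://{domain}/page/{page}", visits_per_page),
            )
            place_id = cur.lastrowid
            # Space the visits one minute apart, in microseconds.
            visits = [
                (place_id,
                 start_usec + (page * visits_per_page + v) * USEC_PER_MINUTE)
                for v in range(visits_per_page)
            ]
            conn.executemany(
                "INSERT INTO moz_historyvisits (place_id, visit_date) VALUES (?, ?)",
                visits,
            )
    return conn
```

The other profiles ("visits across many domains", "many tags", "many bookmarks") would vary the same loop rather than the structure.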

Create JS bookmarklet/console script to harvest statistics from Places db.
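The harvester described here is JavaScript run inside the browser. Purely as an offline illustration of the same idea, the sketch below collects row counts from a copy of a Places db in Python and emits them as JSON. The function name `collect_table_counts` and the choice of "row count per moz_* table" as the statistic are assumptions, not what the attached script gathers.

```python
import json
import sqlite3


def collect_table_counts(conn):
    """Return {table_name: row_count} for every moz_* table in the DB."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name GLOB 'moz_*' ORDER BY name"
        )
    ]
    # Table names come from sqlite_master, so quoting them is sufficient.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def stats_as_json(conn):
    """Serialize the counts as JSON for easy parsing later."""
    return json.dumps(collect_table_counts(conn), sort_keys=True)
```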
(Assignee)

Updated

10 years ago
Assignee: nobody → ddahl
(Assignee)

Comment 1

10 years ago
This is working on my machine; posting so adw can get it running as well. We may yet need to create a virtualenv for automated usage.
Posted file Places stats generator (obsolete) —
First draft JS to generate stats on the user's Places database.  Copy and paste into your JS console.  Outputs stats to the console in JSON for easy parsing later.
Posted file Places stats generator (obsolete) —
With source code comments about each piece collected and dates output as nice strings.
Attachment #364399 - Attachment is obsolete: true
Posted file Places stats generator (obsolete) —
Now computes livemark container and livemark child counts, per Dietrich's email.
Attachment #364419 - Attachment is obsolete: true
Nice. Some comments on the receiving page:
The legend is vertical while the results are horizontal. Every column could have a tooltip with a description and the query we run.
Column headers should also stay visible when there are many entries.
Posted file Places stats generator (obsolete) —
temp tables handled correctly.
Attachment #364888 - Attachment is obsolete: true
Posted file Places stats generator (obsolete) —
Now with an alert on completion!
Attachment #364954 - Attachment is obsolete: true
Posted file Places stats generator (obsolete) —
Doesn't phone home anymore.  To be used with new site.
Attachment #364972 - Attachment is obsolete: true
Posted file Places stats generator (obsolete) —
Attachment #365299 - Attachment is obsolete: true
Posted file Places stats generator (obsolete) —
Attachment #365322 - Attachment is obsolete: true
(Assignee)

Comment 11

10 years ago
We should wrap it up in a Ubiquity command.
(Assignee)

Comment 12

10 years ago
WIP #2: added more information about the command in the "preview" and updated the URL to https://places-stats.mozilla.com
Attachment #365689 - Attachment is obsolete: true
I see that you're getting information about all the add-ons a user has installed, but you're not collecting which add-ons are actually enabled or not. I think you should probably get that data, too. Breakpad collects that, for instance.
(In reply to comment #13)
Yeah, two reasons for that:
1) I couldn't figure out how to do it. :\ If anyone knows, holla back.
2) People can disable an extension after using it for a long time.  I imagined we would come across a scenario like this:  We get some stats that bear the signature of a certain add-on.  The user(s) has that add-on installed but disabled.  So we totally ignore the disabled status.  Or, not even that.  We come across some stats with weird profiles.  Which extension is the cause?  Some are disabled, but maybe they were enabled last week.  So again we assume they were each enabled at one point and ignore disabled status.
(In reply to comment #14)
I'm in support of (2).  If it's installed, it likely had its impact on the db at some point.
(Assignee)

Comment 17

10 years ago
Updated the python generator code - added glue to allow https fetch of JSON stats data live from the https://places-stats.mozilla.com/stats/ site
Attachment #364398 - Attachment is obsolete: true
(Assignee)

Comment 18

10 years ago
According to sdwilsh:

Add moz_keywords 
Add moz_inputhistory

to the data generator

Updated

10 years ago
Attachment #365326 - Attachment is obsolete: true
(Assignee)

Comment 19

10 years ago
Added keywords support; inputhistory is next.
Attachment #366916 - Attachment is obsolete: true
(Assignee)

Comment 20

10 years ago
I plan on setting up my mac mini to run the generator nightly, perhaps pushing the resulting places.sqlite to the intranet.
Attachment #367118 - Attachment is obsolete: true
(Assignee)

Updated

10 years ago
Blocks: 489513
(Assignee)

Comment 21

10 years ago
Attachment #367263 - Attachment is obsolete: true
(Assignee)

Comment 22

10 years ago
Posted patch generator update (obsolete)
Attachment #374173 - Attachment is obsolete: true
(Assignee)

Comment 23

10 years ago
Posted patch generator update (obsolete)
generate.sh
Attachment #374525 - Attachment is obsolete: true
Assignee: ddahl → nobody
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Hardware: x86 → All
(Assignee)

Comment 24

10 years ago
Need to add a date updater script to use daily to keep the data in the db "fresh"
IMHO advising users to paste code into the error console's command line isn't the best of ideas, more in bug 491243 comment 11.
(Assignee)

Updated

10 years ago
Depends on: 498820
(Assignee)

Comment 26

10 years ago
Cleaned up date handling and added ability to update all dates in the generated db via builddb/increment_dates.py - relies on existing env vars of the generate script. Not Python 2.6 compatible due to timeout bug in httplib2
Attachment #374526 - Attachment is obsolete: true
(Assignee)

Comment 27

10 years ago
Forgot to update the places schema to the current 3.5 version
Attachment #385436 - Attachment is obsolete: true
(Assignee)

Comment 28

10 years ago
The generated places dbs should be places_generated_max.sqlite, places_generated_avg.sqlite, places_generated_min.sqlite

The command line args look like this 

python builddb/generator.py -i avg

python builddb/generator.py -i min

python builddb/generator.py -i max
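For readers who want to see how a `-i` option could map onto those three file names, here is a minimal sketch in modern Python. The real option parsing in builddb/generator.py is not shown in this bug, so `parse_args`, `output_filename`, and the help text are assumptions; only the `-i` values and the output file names come from the comment above.

```python
import argparse

PROFILES = ("min", "avg", "max")


def parse_args(argv=None):
    """Parse the profile selector, e.g. ['-i', 'avg']."""
    parser = argparse.ArgumentParser(description="Generate a sample Places DB")
    parser.add_argument(
        "-i",
        dest="profile",
        choices=PROFILES,
        required=True,
        help="which stats profile to generate (min, avg or max)",
    )
    return parser.parse_args(argv)


def output_filename(profile):
    """Map a profile name to the generated database file name."""
    return f"places_generated_{profile}.sqlite"
```

With this, `output_filename(parse_args(["-i", "avg"]).profile)` yields `places_generated_avg.sqlite`, and an unknown profile is rejected by argparse before any work starts.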
(Assignee)

Comment 29

10 years ago
Average generation failed due to floating-point-to-int type coercion.
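The failing code is not shown in this bug. One common shape of this problem is that an averaged row count is a float, while anything that loops "that many times" needs an int. A minimal sketch of the usual guard, with the hypothetical helper name `average_count`:

```python
def average_count(samples):
    """Average a list of row counts and return a whole number of rows."""
    if not samples:
        raise ValueError("need at least one sample")
    # The mean is a float; round it and convert explicitly so callers
    # can pass the result straight to range() or a LIMIT clause.
    return int(round(sum(samples) / len(samples)))
```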
Attachment #385483 - Attachment is obsolete: true
(Assignee)

Comment 30

10 years ago
Now we know when the generator/date increment script was last run - so the date_increment.py will calculate how many days to roll the dates forward.
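The rolling step can be sketched as follows. This is an illustration rather than the attached script: how the last-run date is stored is not shown here, so it is passed in as a parameter, and only visit_date in a simplified moz_historyvisits table is updated. It relies on Places storing timestamps as microseconds since the epoch.

```python
import sqlite3
from datetime import date

USEC_PER_DAY = 24 * 60 * 60 * 1_000_000


def days_to_roll(last_run, today):
    """Whole days elapsed since the last run, never negative."""
    return max((today - last_run).days, 0)


def roll_visit_dates(conn, days):
    """Shift every visit forward by `days`; return the number of rows touched."""
    with conn:  # single transaction
        cur = conn.execute(
            "UPDATE moz_historyvisits SET visit_date = visit_date + ?",
            (days * USEC_PER_DAY,),
        )
    return cur.rowcount
```

A real updater would need to shift every date column in the db by the same amount so relative ordering stays intact.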
Attachment #385506 - Attachment is obsolete: true
Attachment #387327 - Flags: review?(anodelman)
(Assignee)

Comment 31

10 years ago
Zipped it for cvs users:)
Attachment #387328 - Flags: review?(anodelman)
(Assignee)

Comment 32

10 years ago
I wonder if writing the queries to a file, and reading that file as a single transaction would really speed up the initial creation step? Would the overhead of writing to the file then reading back the file and executing all of the inserts be slower than the current implementation via django's ORM?
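To make the idea concrete, here is a minimal sketch of the file-based approach: write the INSERT statements to a file wrapped in BEGIN/COMMIT, then replay the file in one go. The helper names and the single-column moz_places table are assumptions, and the sketch says nothing about whether this is actually faster than the ORM path; that would need to be timed on a real run.

```python
import sqlite3


def write_insert_script(path, urls):
    """Write one INSERT per URL, wrapped in a single transaction."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("BEGIN;\n")
        for url in urls:
            escaped = url.replace("'", "''")  # escape quotes for a SQL literal
            fh.write(f"INSERT INTO moz_places (url) VALUES ('{escaped}');\n")
        fh.write("COMMIT;\n")


def run_insert_script(conn, path):
    """Read the script back and execute all statements at once."""
    with open(path, encoding="utf-8") as fh:
        conn.executescript(fh.read())
```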
Adding dev-doc-needed. Please create a wiki page explaining how to use it and which prerequisites it needs. I spent an hour discovering all the prerequisites on Ubuntu (the django, httplib2, and simplejson Python packages are needed, plus a bunch of env vars). It is finally working, but it was not as straightforward as I expected :) A script to set up the env variables would also help if this is going to be a general-purpose db generator.
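Until such a page exists, a small check like the following could at least report which of the three Python packages are missing. It is a sketch written for modern Python, with the hypothetical helper name `missing_modules`. The required env vars are not named in this bug, so they are not checked.

```python
import importlib.util

# Import names of the packages listed in the comment above.
REQUIRED_MODULES = ("django", "httplib2", "simplejson")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names of modules that cannot be found on this system."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` with no arguments returns an empty list when all three packages can be imported.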
Keywords: dev-doc-needed
can this be closed?
This still needs to be landed somewhere, as far as I know?
Assignee: nobody → anodelman
Attachment #387327 - Flags: review?(anodelman) → review+
Attachment #387328 - Flags: review?(anodelman) → review+
Assignee: anodelman → nobody
(Assignee)

Updated

9 years ago
Assignee: nobody → ddahl
(Assignee)

Comment 36

9 years ago
Alice said over IRC that it could be checked into the talos repo. If it needs any tweaks I can look it over first.
Attachment #453234 - Flags: feedback?(ddahl)
Comment on attachment 453234
[checked in]add generator code to buildfarm/utils

This looks like it's already been used a few times. Don't think it makes sense for me to review it. Welcome to stampy town.
Attachment #453234 - Flags: review?(bhearsum) → review+
(Assignee)

Comment 39

9 years ago
Comment on attachment 453234
[checked in]add generator code to buildfarm/utils

Looks good to me. I assume this is exactly the code you are currently using in production?
Attachment #453234 - Flags: feedback?(ddahl) → feedback+
Comment on attachment 453234
[checked in]add generator code to buildfarm/utils

changeset:   650:455404252ab3
Attachment #453234 - Attachment description: add generator code to buildfarm/utils → [checked in]add generator code to buildfarm/utils
(Assignee)

Updated

6 years ago
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED