Closed
Bug 952101
Opened 11 years ago
Closed 11 years ago
Version 2 of Bugzilla-ETL
Categories
(Testing :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ekyle, Assigned: ekyle)
References
()
Details
Attachments
(1 file)
1.05 MB,
patch
|
mcote
:
review+
|
Details | Diff | Splinter Review |
Fix missing private bugs in private cluster -
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-1bf47e2bbc2d093719a4eacfa1645ef4R225
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-1bf47e2bbc2d093719a4eacfa1645ef4R298
Test
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-469bf8513f3c4ac87f66514572ce704fR301
Fix tracking flags missed during incremental ETL
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-e7f5c575eeda495b16d7432bbe35b033R400
Test
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-469bf8513f3c4ac87f66514572ce704fR363
Fix json encoding of datetime.date objects
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-388b6b027188a75b8fe851ad96a9d01cR111
Test
https://github.com/klahnakoski/pyLibrary/blob/master/tests/test_json.py#L20
Use unittest.TestCase (generally explains the whitespace issues in this file)
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-469bf8513f3c4ac87f66514572ce704fL9
All changes here
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1
Assignee | ||
Comment 2•11 years ago
|
||
Only single instance of bzETL to run
https://github.com/klahnakoski/Bugzilla-ETL/commit/e88f813b6ed027a0f9ed29c43bae9df4d391b70e?w=1#diff-83d3dd242193bcf3ebde0f1de4f03823R85
Assignee | ||
Updated•11 years ago
|
Comment 3•11 years ago
|
||
Comment on attachment 8350054 [details] [diff] [review]
Please review
Review of attachment 8350054 [details] [diff] [review]:
-----------------------------------------------------------------
I can't claim to fully understand the tests, but it all looks decent. :) Little nits:
::: MANIFEST.in
@@ +15,5 @@
> +include README.txt
> +exclude MANIFEST.in
> +
> +
> +
Remove blank lines.
::: bzETL/extract_bugzilla.py
@@ +168,3 @@
>
> try:
> + comments = db.query("""
Why did you make some of these unicode strings but not all?
::: bzETL/util/cnv.py
@@ +36,5 @@
> @staticmethod
> def JSON2object(json_string, params=None, flexible=False):
> try:
> #REMOVE """COMMENTS""", #COMMENTS, //COMMENTS, AND \n \r
> + if flexible: json_string = re.sub(r"\"\"\".*?\"\"\"|\s+//.*\n|#.*?\n|\n|\r", r" ",
May as well move the clause down to its own line.
@@ +231,5 @@
> + return _filter(esfilter, row, rownum, rows)
> + return output
> +
> +def _filter(esfilter, row, rownum, rows):
> + esfilter=struct.wrap(esfilter)
Since you've been fixing these...add spaces around the =.
@@ +273,5 @@
> + field = esfilter.missing
> + else:
> + field = esfilter.missing.field
> +
> + if row[field] == None:
"is None" is the general convention.
@@ +283,5 @@
> + field = esfilter.missing
> + else:
> + field = esfilter.missing.field
> +
> + if row[field] != None:
"is not None" would be the convention here.
::: bzETL/util/db.py
@@ +210,2 @@
>
> + if param: sql = expand_template(sql, self.quote_param(param))
Might as well split this into two lines.
@@ +212,5 @@
> sql = self.preamble + outdent(sql)
> if self.debug:
> Log.note(u"Execute SQL:\n{{sql}}", {u"sql": indent(sql)})
>
> + # if isinstance(sql, unicode):
Generally it's bad to leave commented-out code, in part because it can bitrot and you'll never notice until you finally want to use it. If you really want to leave this in, put in a comment as to why it's commented out.
@@ +314,5 @@
> stderr=subprocess.STDOUT,
> bufsize=-1
> )
> + if isinstance(sql, unicode):
> + sql=sql.encode("utf-8")
Spacing.
::: bzETL/util/files.py
@@ +20,4 @@
>
>
> class File(object):
> + def __init__(self, filename, buffering=2 ** 14):
I believe the standard for the power operator is no spaces, e.g. 2**14.
::: bzETL/util/jsons.py
@@ +9,1 @@
> #
Random comment: might be good to explain why this file is preferable to the standard json module.
@@ +148,2 @@
> ESCAPE_DCT = {
> + u"\\": u"\\\\",
Any reason you switched all the single quotes to double quotes here?
::: bzETL/util/maths.py
@@ +113,5 @@
> return output
>
> + @staticmethod
> + def ceiling(value):
> + return math.ceil(value)
Why bother with this? Why not use math.ceil() directly where this is being called? Just to keep all the math-related stuff you use in this file?
::: bzETL/util/queries/Q.py
@@ +544,5 @@
>
> def __getitem__(self, key):
> try:
> + if isinstance(key, dict):
> + key=struct.unwrap(key)
Spacing.
@@ +547,5 @@
> + if isinstance(key, dict):
> + key=struct.unwrap(key)
> + key = [key.get(k, None) for k in self._keys]
> + elif not isinstance(key, list):
> + key=[key]
Spacing.
@@ +567,5 @@
>
>
> def add(self, val):
> + if not isinstance(val, dict):
> + val = {(self._keys[0], val)}
Just noticed this... why the init via tuple and not the standard colon notation?
::: bzETL/util/queries/windows.py
@@ +174,5 @@
>
> def end(self):
> return self.total
> +
> +
Remove extra blank lines.
::: bzETL/util/startup.py
@@ +10,5 @@
>
> import argparse
> +import os
> +import tempfile
> +import sys
Sorting.
@@ +95,5 @@
> +
> + Remember that this works by creating a lock file with a filename based on the full path to the script file.
> + """
> + def __init__(self, flavor_id=""):
> + import sys
Why in the constructor?
::: tests/resources/python/look_for_leaks.py
@@ +10,5 @@
> +
> + # max_bug_id = private.query({
> + # "from":"private_bugs",
> + # "select":{"value":"bug_id","aggregate":"maximum"}
> + # })
Remove or add comment (and elsewhere below).
@@ +15,5 @@
> +
> + max_bug_id = private.search({
> + "query":{"filtered":{
> + "query":{"match_all":{}},
> + "filter":{"and":[{"match_all":{}}]}
Horizontal spacing (and below).
@@ +57,5 @@
> + })
> +
> + private_ids=set(Q.select(results.hits.hits, "fields.bug_id"))
> +
> +
Spacing (both vertical and horizontal).
@@ +91,5 @@
> + #VERIFY NONE IN PUBLIC
> +
> + #FIND ALL PRIVATE COMMENTS
> +
> + #VERIFY NONE IN PUBLIC
What are these comments about?
@@ +96,5 @@
> +
> +
> +
> +
> +if __name__=="__main__":
Two blank lines would suffice. :) Also, horizontal spacing.
@@ +101,5 @@
> + try:
> + settings = startup.read_settings()
> + Log.start(settings.debug)
> + private = ElasticSearch(settings.private)
> + public = ElasticSearch(settings.public)
Spacing.
::: tests/test_etl.py
@@ +9,4 @@
> #
>
> from datetime import datetime
> +import unittest
Sorting. And technically "from" import should go below direct imports.
@@ +108,5 @@
> + param.end_time = CNV.datetime2milli(get_current_time(db))
> + param.start_time = 0
> + param.start_time_str = extract_bugzilla.milli2string(db, 0)
> + param.alias_file = self.settings.param.alias_file
> + # param.bugs_filter=SQL("bug_id in {{bugs}}", {"bugs":db.quote_value(some_bugs)})
Remove or add comment as to why this is commented out. Also, weird spacing.
@@ +125,5 @@
> + except Exception, e:
> + Log.warning("Total failure during compare of bugs {{bugs}}", {"bugs": some_bugs}, e)
> +
> +
> + def test_private_etl(self):
Don't double-space method definitions within a class.
@@ +145,5 @@
> +
> + def test_public_etl(self):
> + """
> +
> + """
Empty docstring...
@@ +475,3 @@
> ref_versions = \
> Q.sort(
> + map(# map-lambda ADDED TO FIC OLD PRODUCTION BUG VERSIONS
I know this is pre-existing code, but why is this not a list comprehension? If this comment explains it, I don't quite understand. :)
::: tests/util/database.py
@@ +115,5 @@
> db.insert("bugs_activity", activity)
>
> + db.execute("UPDATE bugs SET delta_ts={{now}} WHERE {{where}}", {
> + "now":now,
> + "where":db.esfilter2sqlwhere({"term":{"bug_id":old_record.bug_id}})
Spacing.
::: tests/util/elasticsearch.py
@@ +41,4 @@
>
>
> def search(self, query):
> + f=CNV.esfilter2where(struct.wrap(query).query.filtered.filter)
Spacing.
Attachment #8350054 -
Flags: review?(mcote) → review+
Assignee | ||
Comment 4•11 years ago
|
||
Fixes (no code changes, all tests pass):
https://github.com/klahnakoski/Bugzilla-ETL/commit/cccb89f3ab27c4abc1f61baa2199aa1a781e71b0
> "is None" is the general convention.
Yes, but I have the NullStruct floating around my code. Using x == None does a more complex comparison (https://github.com/klahnakoski/Bugzilla-ETL/blob/master/bzETL/util/struct.py#L228)
> Random comment: might be good to explain why this file is preferable to the standard json module.
Thank you. Added comment (https://github.com/klahnakoski/Bugzilla-ETL/commit/cccb89f3ab27c4abc1f61baa2199aa1a781e71b0#diff-388b6b027188a75b8fe851ad96a9d01cR11)
> @@ +148,2 @@
> > ESCAPE_DCT = {
> > + u"\\": u"\\\\",
>
> Any reason you switched all the single quotes to double quotes here?
Only for the sake of uniformity with other lines. I had to double-take on the backslash and whether I got it correct.
> ::: bzETL/util/maths.py
> @@ +113,5 @@
> > return output
> >
> > + @staticmethod
> > + def ceiling(value):
> > + return math.ceil(value)
>
> Why bother with this? Why not use math.ceil() directly where this is being called?
One of my many weaknesses is a terrible memory for the multitude of short forms. In this case I keep forgetting that the "ceiling" function is missing "ing". After having to look this up twice, and even wondering if Python had such a function, I added a full name version here so I do not need to remember this again. Furthermore, one day, I will add "mod" parameter so I can perform ceiling() modulo some base value.
> @@ +475,3 @@
> > ref_versions = \
> > Q.sort(
> > + map(# map-lambda ADDED TO FIC OLD PRODUCTION BUG VERSIONS
>
> I know this is pre-existing code, but why is this not a list comprehension? If this comment
> explains it, I don't quite understand. :)
Thank you! List comprehension should have been used. I must have railroaded myself into using map() for some reason I long forgot. Changed is in a future version.
> > +
> > + Remember that this works by creating a lock file with a filename based on the full path to
> the script file.
> > + """
> > + def __init__(self, flavor_id=""):
> > + import sys
>
> Why in the constructor?
Good question! Will have it fixed in another version.
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → klahnakoski
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•