Closed Bug 952101 Opened 11 years ago Closed 11 years ago

Version 2 of Bugzilla-ETL

Categories

(Testing :: General, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ekyle, Assigned: ekyle)

References

()

Details

Attachments

(1 file)

Attached patch Please reviewSplinter Review
See main bug comment
Attachment #8350054 - Flags: review?(mcote)
Depends on: 952108
Blocks: 952108
No longer depends on: 952108
Comment on attachment 8350054 [details] [diff] [review] Please review Review of attachment 8350054 [details] [diff] [review]: ----------------------------------------------------------------- I can't claim to fully understand the tests, but it all looks decent. :) Little nits: ::: MANIFEST.in @@ +15,5 @@ > +include README.txt > +exclude MANIFEST.in > + > + > + Remove blank lines. ::: bzETL/extract_bugzilla.py @@ +168,3 @@ > > try: > + comments = db.query(""" Why did you make some of these unicode strings but not all? ::: bzETL/util/cnv.py @@ +36,5 @@ > @staticmethod > def JSON2object(json_string, params=None, flexible=False): > try: > #REMOVE """COMMENTS""", #COMMENTS, //COMMENTS, AND \n \r > + if flexible: json_string = re.sub(r"\"\"\".*?\"\"\"|\s+//.*\n|#.*?\n|\n|\r", r" ", May as well move the clause down to its own line. @@ +231,5 @@ > + return _filter(esfilter, row, rownum, rows) > + return output > + > +def _filter(esfilter, row, rownum, rows): > + esfilter=struct.wrap(esfilter) Since you've been fixing these...add spaces around the =. @@ +273,5 @@ > + field = esfilter.missing > + else: > + field = esfilter.missing.field > + > + if row[field] == None: "is None" is the general convention. @@ +283,5 @@ > + field = esfilter.missing > + else: > + field = esfilter.missing.field > + > + if row[field] != None: "is not None" would be the convention here. ::: bzETL/util/db.py @@ +210,2 @@ > > + if param: sql = expand_template(sql, self.quote_param(param)) Might as well split this into two lines. @@ +212,5 @@ > sql = self.preamble + outdent(sql) > if self.debug: > Log.note(u"Execute SQL:\n{{sql}}", {u"sql": indent(sql)}) > > + # if isinstance(sql, unicode): Generally it's bad to leave commented-out code, in part because it can bitrot and you'll never notice until you finally want to use it. If you really want to leave this in, put in a comment as to why it's commented out. @@ +314,5 @@ > stderr=subprocess.STDOUT, > bufsize=-1 > ) > + if isinstance(sql, unicode): > + sql=sql.encode("utf-8") Spacing. ::: bzETL/util/files.py @@ +20,4 @@ > > > class File(object): > + def __init__(self, filename, buffering=2 ** 14): I believe the standard for the power operator is no spaces, e.g. 2**14. ::: bzETL/util/jsons.py @@ +9,1 @@ > # Random comment: might be good to explain why this file is preferable to the standard json module. @@ +148,2 @@ > ESCAPE_DCT = { > + u"\\": u"\\\\", Any reason you switched all the single quotes to double quotes here? ::: bzETL/util/maths.py @@ +113,5 @@ > return output > > + @staticmethod > + def ceiling(value): > + return math.ceil(value) Why bother with this? Why not use math.ceil() directly where this is being called? Just to keep all the math-related stuff you use in this file? ::: bzETL/util/queries/Q.py @@ +544,5 @@ > > def __getitem__(self, key): > try: > + if isinstance(key, dict): > + key=struct.unwrap(key) Spacing. @@ +547,5 @@ > + if isinstance(key, dict): > + key=struct.unwrap(key) > + key = [key.get(k, None) for k in self._keys] > + elif not isinstance(key, list): > + key=[key] Spacing. @@ +567,5 @@ > > > def add(self, val): > + if not isinstance(val, dict): > + val = {(self._keys[0], val)} Just noticed this... why the init via tuple and not the standard colon notation? ::: bzETL/util/queries/windows.py @@ +174,5 @@ > > def end(self): > return self.total > + > + Remove extra blank lines. ::: bzETL/util/startup.py @@ +10,5 @@ > > import argparse > +import os > +import tempfile > +import sys Sorting. @@ +95,5 @@ > + > + Remember that this works by creating a lock file with a filename based on the full path to the script file. > + """ > + def __init__(self, flavor_id=""): > + import sys Why in the constructor? ::: tests/resources/python/look_for_leaks.py @@ +10,5 @@ > + > + # max_bug_id = private.query({ > + # "from":"private_bugs", > + # "select":{"value":"bug_id","aggregate":"maximum"} > + # }) Remove or add comment (and elsewhere below). @@ +15,5 @@ > + > + max_bug_id = private.search({ > + "query":{"filtered":{ > + "query":{"match_all":{}}, > + "filter":{"and":[{"match_all":{}}]} Horizontal spacing (and below). @@ +57,5 @@ > + }) > + > + private_ids=set(Q.select(results.hits.hits, "fields.bug_id")) > + > + Spacing (both vertical and horizontal). @@ +91,5 @@ > + #VERIFY NONE IN PUBLIC > + > + #FIND ALL PRIVATE COMMENTS > + > + #VERIFY NONE IN PUBLIC What are these comments about? @@ +96,5 @@ > + > + > + > + > +if __name__=="__main__": Two blank lines would suffice. :) Also, horizontal spacing. @@ +101,5 @@ > + try: > + settings = startup.read_settings() > + Log.start(settings.debug) > + private = ElasticSearch(settings.private) > + public = ElasticSearch(settings.public) Spacing. ::: tests/test_etl.py @@ +9,4 @@ > # > > from datetime import datetime > +import unittest Sorting. And technically "from" import should go below direct imports. @@ +108,5 @@ > + param.end_time = CNV.datetime2milli(get_current_time(db)) > + param.start_time = 0 > + param.start_time_str = extract_bugzilla.milli2string(db, 0) > + param.alias_file = self.settings.param.alias_file > + # param.bugs_filter=SQL("bug_id in {{bugs}}", {"bugs":db.quote_value(some_bugs)}) Remove or add comment as to why this is commented out. Also, weird spacing. @@ +125,5 @@ > + except Exception, e: > + Log.warning("Total failure during compare of bugs {{bugs}}", {"bugs": some_bugs}, e) > + > + > + def test_private_etl(self): Don't double-space method definitions within a class. @@ +145,5 @@ > + > + def test_public_etl(self): > + """ > + > + """ Empty docstring... @@ +475,3 @@ > ref_versions = \ > Q.sort( > + map(# map-lambda ADDED TO FIC OLD PRODUCTION BUG VERSIONS I know this is pre-existing code, but why is this not a list comprehension? If this comment explains it, I don't quite understand. :) ::: tests/util/database.py @@ +115,5 @@ > db.insert("bugs_activity", activity) > > + db.execute("UPDATE bugs SET delta_ts={{now}} WHERE {{where}}", { > + "now":now, > + "where":db.esfilter2sqlwhere({"term":{"bug_id":old_record.bug_id}}) Spacing. ::: tests/util/elasticsearch.py @@ +41,4 @@ > > > def search(self, query): > + f=CNV.esfilter2where(struct.wrap(query).query.filtered.filter) Spacing.
Attachment #8350054 - Flags: review?(mcote) → review+
Fixes (no code changes, all tests pass): https://github.com/klahnakoski/Bugzilla-ETL/commit/cccb89f3ab27c4abc1f61baa2199aa1a781e71b0 > "is None" is the general convention. Yes, but I have the NullStruct floating around my code. Using x == None does a more complex comparison (https://github.com/klahnakoski/Bugzilla-ETL/blob/master/bzETL/util/struct.py#L228) > Random comment: might be good to explain why this file is preferable to the standard json module. Thank you. Added comment (https://github.com/klahnakoski/Bugzilla-ETL/commit/cccb89f3ab27c4abc1f61baa2199aa1a781e71b0#diff-388b6b027188a75b8fe851ad96a9d01cR11) > @@ +148,2 @@ > > ESCAPE_DCT = { > > + u"\\": u"\\\\", > > Any reason you switched all the single quotes to double quotes here? Only for the sake of uniformity with other lines. I had to double-take on the backslash and whether I got it correct. > ::: bzETL/util/maths.py > @@ +113,5 @@ > > return output > > > > + @staticmethod > > + def ceiling(value): > > + return math.ceil(value) > > Why bother with this? Why not use math.ceil() directly where this is being called? One of my many weaknesses is a terrible memory for the multitude of short forms. In this case I keep forgetting that the "ceiling" function is missing "ing". After having to look this up twice, and even wondering if Python had such a function, I added a full name version here so I do not need to remember this again. Furthermore, one day, I will add "mod" parameter so I can perform ceiling() modulo some base value. > @@ +475,3 @@ > > ref_versions = \ > > Q.sort( > > + map(# map-lambda ADDED TO FIC OLD PRODUCTION BUG VERSIONS > > I know this is pre-existing code, but why is this not a list comprehension? If this comment > explains it, I don't quite understand. :) Thank you! List comprehension should have been used. I must have railroaded myself into using map() for some reason I long forgot. Changed is in a future version. > > + > > + Remember that this works by creating a lock file with a filename based on the full path to > the script file. > > + """ > > + def __init__(self, flavor_id=""): > > + import sys > > Why in the constructor? Good question! Will have it fixed in another version.
Assignee: nobody → klahnakoski
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: