Closed Bug 1308787 Opened 3 years ago Closed 3 years ago

Certificate Transparency - script for generating the static list of known logs

Categories

(Core :: Security: PSM, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: sergei.cv, Assigned: sergei.cv)

References

(Blocks 1 open bug)

Details

(Whiteboard: [psm-assigned])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36
Based on the JSON file downloaded from https://www.certificate-transparency.org/known-logs, the script will generate the security/certverifier/CTKnownLogs.h header file listing the known CT logs.

The logs list should eventually include the disqualified logs (see https://www.chromium.org/Home/chromium-security/certificate-transparency), but this information is not currently available in the downloadable JSON file and will be added manually to the generated file.

The script will be implemented in Python to ease integration into the build system at later stages.
Blocks: 1281469
Component: Untriaged → Security
Comment on attachment 8808582 [details]
Bug 1308787 - Certificate Transparency - script for generating the static list of known logs.

https://reviewboard.mozilla.org/r/91386/#review91370

This looks good to me, but I'm doing to ask Cykesiopka to also have a quick look since I'm not extremely familiar with reviewing python.

::: security/manager/tools/getCTKnownLogs.py:55
(Diff revision 1)
> +
> +def get_multiline_hex(blob, width):
> +    text = "".join([ "\\x%.2x" % ord(c) for c in blob ])
> +    # When escaped, a single byte takes 4 chars (e.g. "\x00").
> +    # Make sure we don't break an escaped byte between the lines.
> +    return textwrap.wrap(text, width - width % 4)

This is going to come out to 72, right? Why not just hard-code that?

::: security/manager/tools/getCTKnownLogs.py:58
(Diff revision 1)
> +    # When escaped, a single byte takes 4 chars (e.g. "\x00").
> +    # Make sure we don't break an escaped byte between the lines.
> +    return textwrap.wrap(text, width - width % 4)
> +
> +def get_log_info_struct_initializers(input):
> +  tmpl = Template(textwrap.dedent("""\

nit: 4 space indentation for this function

::: security/manager/tools/getCTKnownLogs.py:80
(Diff revision 1)
> +
> +def generate_cpp_header_file(input, f):
> +    include_guard = os.path.basename(f.name).replace('.', '_').replace('/', '_')
> +    log_info_initializers = get_log_info_struct_initializers(input)
> +    f.write(Template(OUTPUT_TEMPLATE).substitute(
> +        prog=sys.argv[0],

Should we just use the basename here so the comment doesn't include the path to the utility?
Attachment #8808582 - Flags: review?(dkeeler) → review+
Comment on attachment 8808582 [details]
Bug 1308787 - Certificate Transparency - script for generating the static list of known logs.

https://reviewboard.mozilla.org/r/91386/#review91370

> This is going to come out to 72, right? Why not just hard-code that?

I think it's nice to have the line width calculation on a single line (where get_multiline_hex is called). 80 - 6 = 74 which is passed to get_multiline_hex.
Comment on attachment 8808582 [details]
Bug 1308787 - Certificate Transparency - script for generating the static list of known logs.

https://reviewboard.mozilla.org/r/91386/#review91986

This looks good - I just have some minor comments.

::: security/manager/tools/getCTKnownLogs.py:1
(Diff revision 2)
> +#!/usr/bin/env python

This file should have the executable bit set, so we don't have to manually set the correct permissions every time.
(The line itself is fine, I'm just abusing it.)

::: security/manager/tools/getCTKnownLogs.py:17
(Diff revision 2)
> +from __future__ import print_function
> +from string import Template
> +import argparse
> +import base64
> +import json
> +import math

This appears to be unused.

::: security/manager/tools/getCTKnownLogs.py:52
(Diff revision 2)
> +
> +#endif // $include_guard
> +"""
> +
> +def get_multiline_hex(blob, width):
> +    text = "".join([ "\\x%.2x" % ord(c) for c in blob ])

Let's make this something like
```py
text = "".join(["\\x{:02x}".format(ord(c)) for c in blob])
```
instead.

Here's the reasoning:
- Specifying a precision via . is weird when the input should always be an integer.
- format() is nicer than %, IMO. We should also try to be consistent with using either % or format() if possible.
- The whitespace inside the brackets is unnecessary.

::: security/manager/tools/getCTKnownLogs.py:57
(Diff revision 2)
> +    text = "".join([ "\\x%.2x" % ord(c) for c in blob ])
> +    # When escaped, a single byte takes 4 chars (e.g. "\x00").
> +    # Make sure we don't break an escaped byte between the lines.
> +    return textwrap.wrap(text, width - width % 4)
> +
> +def get_log_info_struct_initializers(input):

`input` here redefines the built-in `input()` and isn't a name that makes it really obvious what type of input it is, so let's call it something else.
`json_input` maybe.

::: security/manager/tools/getCTKnownLogs.py:61
(Diff revision 2)
> +
> +def get_log_info_struct_initializers(input):
> +    tmpl = Template(textwrap.dedent("""\
> +          { $description,
> +            $url,
> +        $idented_log_key,

Typo: idented.

::: security/manager/tools/getCTKnownLogs.py:77
(Diff revision 2)
> +            idented_log_key="\n".
> +              join(['    "%s"' % l for l in get_multiline_hex(log_key, 74)]),
> +            log_key_len=len(log_key)))
> +    return initializers
> +
> +def generate_cpp_header_file(input, f):

Same `input` comment. `f` also isn't very descriptive.

::: security/manager/tools/getCTKnownLogs.py:78
(Diff revision 2)
> +              join(['    "%s"' % l for l in get_multiline_hex(log_key, 74)]),
> +            log_key_len=len(log_key)))
> +    return initializers
> +
> +def generate_cpp_header_file(input, f):
> +    include_guard = os.path.basename(f.name).replace('.', '_').replace('/', '_')

Nit: We should be consistent with what type of string quotes we use.

::: security/manager/tools/getCTKnownLogs.py:87
(Diff revision 2)
> +        include_guard=include_guard,
> +        logs=",\n".join(log_info_initializers)))
> +
> +def run(args):
> +    if args.file:
> +        print("Reading file: {0}".format(args.file))

Nit: We're using `print_function`, so let's take advantage of that and make this nicer like so:
```py
print("Reading file:", args.file)
```
Same elsewhere.

::: security/manager/tools/getCTKnownLogs.py:88
(Diff revision 2)
> +        logs=",\n".join(log_info_initializers)))
> +
> +def run(args):
> +    if args.file:
> +        print("Reading file: {0}".format(args.file))
> +        with open(args.file, "rb") as f:

Same `f` comment as above.

::: security/manager/tools/getCTKnownLogs.py:108
(Diff revision 2)
> +    parser = argparse.ArgumentParser(
> +        description=
> +          "Parses a JSON file listing the known "
> +          "Certificate Transparency logs and generates "
> +          "a C++ header file to be included in Firefox.",
> +        epilog=
> +          "Example: python %s --url" % os.path.basename(sys.argv[0]))

Nit: The style guide says we should generally follow PEP-8.
The indentation here is slightly odd, so I recommend using |pylint| and/or |flake8| to help make this code conform to PEP-8 more.

::: security/manager/tools/getCTKnownLogs.py:133
(Diff revision 2)
> +    parser.add_argument("--out",
> +        default=
> +          "../../certverifier/CTKnownLogs.h",
> +        help=
> +          "Path and filename of the header file to be generated. "
> +          "Defaults to %(default)s")

Up to you, but adding `formatter_class=argparse.ArgumentDefaultsHelpFormatter` to the `ArgumentParser` saves you the work of having to write this message manually.
Attachment #8808582 - Flags: review?(cykesiopka.bmo) → review+
Assignee: nobody → sergei.cv
Status: NEW → ASSIGNED
Component: Security → Security: PSM
Priority: -- → P1
Product: Firefox → Core
Whiteboard: [psm-assigned]
Comment on attachment 8808582 [details]
Bug 1308787 - Certificate Transparency - script for generating the static list of known logs.

https://reviewboard.mozilla.org/r/91386/#review91986

> Nit: The style guide says we should generally follow PEP-8.
> The indentation here is slightly odd, so I recommend using |pylint| and/or |flake8| to help make this code conform to PEP-8 more.

Used both pylint and flake8, all the warnings are fixed now (besides "Invalid module name getCTKnownLogs" from pylint which is probably OK).

> Up to you, but adding `formatter_class=argparse.ArgumentDefaultsHelpFormatter` to the `ArgumentParser` saves you the work of having to write this message manually.

Leaving it as is since the text produced by ArgumentDefaultsHelpFormatter does not look right.
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/411dbec99c3f
Certificate Transparency - script for generating the static list of known logs. r=Cykesiopka,keeler
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/411dbec99c3f
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
You need to log in before you can comment on or make changes to this bug.