Closed Bug 935488 Opened 11 years ago Closed 11 years ago

Integrate c++ record writer code

Categories

(Webtools Graveyard :: Telemetry Server, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Assigned: rvitillo)

Details

Attachments

(3 files, 2 obsolete files)

We need to integrate Jonas's RecordWriter code into the main telemetry-server repository.
The to-be-integrated code is here:
https://github.com/jonasfj/telemetry-server/blob/process-incoming-cpp/process_incoming/worker/RecordWriter.cpp

The new code should replace the stubbed out RecordWriter.cpp here:
https://github.com/mozilla/telemetry-server/blob/master/process_incoming/worker/common/RecordWriter.cpp

Note there are some other supporting classes in :jonasfj's branch that need to be moved around / tweaked (data types etc) for integration.

Attached is a small sample input file and tarball the expected output files. The generated UUIDs will vary, but the file contents should match.
Assignee: nobody → rvitillo
Attached patch 935488.patch (obsolete) — Splinter Review
Attachment #832950 - Flags: review?(mreid)
Attachment #832950 - Flags: review?(mreid) → review?(mtrinkala)
Comment on attachment 832950 [details] [diff] [review]
935488.patch

Review of attachment 832950 [details] [diff] [review]:
-----------------------------------------------------------------

- In Utils.cpp/h directory tree creation and UUID generation are both available in Boost.
- The Writer returns a bool but it is not checked and there is no access to the underlying error message/cause.
- All errors are spewed to stderr in an ad-hoc format making them only human consumable (if they were captured)
- Not sure why the Mozilla Public License headers were removed from the source files - http://www.mozilla.org/MPL/headers/
- It looks like there is some white space at the end of lines (some of it is from me... checking my editor settings); That should be cleaned up.
- Might be useful to track some metrics on the writer. i.e., number of files generated, total size, compressed size, any relevant timings, and output them as a Heka message.

I can take a better look at the implementation tomorrow, if desired, this was just a quick pass.
Attached patch Integrate c++ record writer code (obsolete) — Splinter Review
Attachment #832950 - Attachment is obsolete: true
Attachment #832950 - Flags: review?(mtrinkala)
Attachment #8335327 - Flags: review?(mtrinkala)
(In reply to Mike Trinkala [:trink] from comment #3)
> Comment on attachment 832950 [details] [diff] [review]
> - Might be useful to track some metrics on the writer. i.e., number of files
> generated, total size, compressed size, any relevant timings, and output
> them as a Heka message.

I have still to add the metrics to the writer; the rest should be there.
(In reply to Roberto A. Vitillo (:rvitillo) from comment #5)
> (In reply to Mike Trinkala [:trink] from comment #3)
> > Comment on attachment 832950 [details] [diff] [review]
> > - Might be useful to track some metrics on the writer. i.e., number of files
> > generated, total size, compressed size, any relevant timings, and output
> > them as a Heka message.
> 
> I have still to add the metrics to the writer; the rest should be there.

And I have just discovered a bug in the compressor, so I will have to fix that too.
Attached patch 935488.diffSplinter Review
Attachment #8335327 - Attachment is obsolete: true
Attachment #8335327 - Flags: review?(mtrinkala)
Attachment #8336239 - Flags: review?(mtrinkala)
Comment on attachment 8336239 [details] [diff] [review]
935488.diff

Review of attachment 8336239 [details] [diff] [review]:
-----------------------------------------------------------------

::: CMakeLists.txt
@@ -18,3 @@
>  find_package(OpenSSL REQUIRED)
>  find_package(Protobuf 2.3 REQUIRED)
> -find_package(Boost 1.51.0 REQUIRED 

This needs to be updated to 1.54.0 since boost::log is now being pulled in.

::: cmake/mozsvc.cmake
@@ -24,4 @@
>  else()
>      # Predefined Macros: clang|gcc -dM -E -x c /dev/null
>      # Compiler options: http://gcc.gnu.org/onlinedocs/gcc/Invoking-GCC.html#Invoking-GCC
> -    set(CMAKE_C_FLAGS   "-std=c11 -pedantic -Werror -Wall -Wextra -fPIC")

Globally suppressing deprecated errors seem like a bad idea, what are we using that is deprecated and why?

::: process_incoming/worker/common/CompressedFileWriter.cpp
@@ +190,5 @@
> +
> +CompressedFileWriter::~CompressedFileWriter()
> +{
> +  // We should finalize before destruction
> +  assert(!mFile);

This will fail silently on a release build.  

If the file is still open is there a reason we don't finalize it on destruction?

::: process_incoming/worker/common/CompressedFileWriter.h
@@ +15,5 @@
> +
> +/** Buffer output buffer size, before writing to file */
> +#define BUF_SIZE BUFSIZ
> +
> +namespace mozilla {

Is CompressedFileWriter unique enough to put in the top level mozilla namespace?  The other components are in mozilla::telemetry.

@@ +33,5 @@
> +   * Initialize CompressedFileWriter given an LZMA compression level, a number
> +   * between 0 and 9.
> +   * See preset option in xz(1) for more details.
> +   */
> +  bool Initialize(FILE *aFile, uint32_t preset = 0);

Use aPreset

::: process_incoming/worker/common/RecordWriter.cpp
@@ +434,5 @@
>  {
> +  bool retval = true;
> +
> +  // Serialize each file
> +  for (auto it = mFileMap.begin(); it != mFileMap.end(); it++) {

Probably doesn't make much difference these days but I usually pull the end() call out of the condition to avoid repeating the function call and prefix increment the iterator to avoid the temporary variable.

@@ +463,4 @@
>  {
> +  // Create a list of files that can be compressed
> +  vector<OutputFile*> files;
> +  for(auto kv : mFileMap) {

Some more C++11 :)
Wouldn't auto &kv prevent creating a copy of the pair?

@@ +528,5 @@
> +{
> +  boost::uuids::uuid u = boost::uuids::random_generator()();
> +  std::stringstream ss;
> +  ss << u;
> +  std::string ret = ss.str();

You can just use to_string(u) here (don't see a way to not to generate the hyphens though)

::: process_incoming/worker/common/RecordWriter.h
@@ +23,5 @@
>  /**
> + * RecordWriter is responsible for writing records into temporary files stored
> + * in mWorkFolder, compressing and moving them to mUploadFolder, when the files
> + * grow large enough or RecordWriter::Finalize() is called.
> + * The RecordWrite will attempt to do on-the-fly compression of as many files

RecordWrite typo

@@ +46,2 @@
>     */
> +  RecordWriter(const std::string& aWorkFolder, const std::string& aUploadFolder,

Why were the paths change from a path to a string type?

::: process_incoming/worker/convert.cpp
@@ +240,5 @@
>        }
>      }
> +
> +    if (logger()) {
> +      writer.Finalize();

We should Finalize whether or not logging is active.
Forgot to add to the overall comments:
1) where are the unit tests?
2) if possible a github pull request is preferable to a diff (I find it a lot easier to work with)
(In reply to Mike Trinkala [:trink] from comment #8)
> Review of attachment 8336239 [details] [diff] [review]:
> ::: cmake/mozsvc.cmake
> @@ -24,4 @@
> >  else()
> >      # Predefined Macros: clang|gcc -dM -E -x c /dev/null
> >      # Compiler options: http://gcc.gnu.org/onlinedocs/gcc/Invoking-GCC.html#Invoking-GCC
> > -    set(CMAKE_C_FLAGS   "-std=c11 -pedantic -Werror -Wall -Wextra -fPIC")
> 
> Globally suppressing deprecated errors seem like a bad idea, what are we
> using that is deprecated and why?

MD5 is deprecated on OSX 10.7:
/tmp/telemetry-server/process_incoming/worker/common/HistogramCache.cpp:182:3: warning: 'MD5' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
  MD5((unsigned char*)json.c_str(), json.size(), digest);

> ::: process_incoming/worker/convert.cpp
> @@ +240,5 @@
> >        }
> >      }
> > +
> > +    if (logger()) {
> > +      writer.Finalize();
> 
> We should Finalize whether or not logging is active.

Finalize is called during the destruction of the writer in any case.
Comment on attachment 8336239 [details] [diff] [review]
935488.diff

Review of attachment 8336239 [details] [diff] [review]:
-----------------------------------------------------------------

r+
Comments on the PR
Attachment #8336239 - Flags: review?(mtrinkala) → review+
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: