Closed Bug 1248845 Opened 9 years ago Closed 9 years ago

Investigate get_records inconsistency with object size and compression types

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: whd, Assigned: rvitillo)

Wesley Dawson [:whd]

Reporter

Description

•

9 years ago

Following up on bug #1231410. I've uploaded various versions of the data, the parameters being snappy/no snappy and 50-250MB chunks / single object. Here are the results of calling get_records on each variation:

In [3]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-protobuf-single"); records.count()
Out[3]: 32371

In [4]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-snappy-single"); records.count()
Out[4]: 1838

In [5]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-protobuf-100mb"); records.count()
Out[5]: 34300

In [6]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-snappy-100mb"); records.count()
Out[6]: 32828

In [7]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-snappy-250mb"); records.count()
Out[7]: 14497

In [8]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-protobuf-250mb"); records.count()
Out[8]: 31476

In [9]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-snappy-50mb"); records.count()
Out[9]: 34327

In [10]: records = get_records(sc, "telemetry-webrtc", submissionDate="20160101-protobuf-50mb"); records.count()
Out[10]: 34327

$ heka-cat -format count output.log
Input:output.log Offset:0 Match:TRUE Format:count Tail:false Output:
Processed: 34327, matched: 34327 messages

So when the chunks are substantially smaller than _chunk_size (200MB), we see all the records. The larger the object size, the fewer records are returned, and the problem is more apparent with snappy-encoded records.
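To make the size of the shortfall easier to compare across variations, here is a small sketch that tabulates the counts reported above against the heka-cat total. Treating 34327 as the expected record count for every variation is an assumption based on the heka-cat output and the two 50MB runs; the `counts` dictionary and the `missing` helper are illustrative names, not part of any telemetry library.

```python
# Record counts copied from the get_records session in this description.
# Assumption: the heka-cat total is the expected count for every variation.
EXPECTED = 34327

counts = {
    "protobuf-single": 32371,
    "snappy-single": 1838,
    "protobuf-50mb": 34327,
    "snappy-50mb": 34327,
    "protobuf-100mb": 34300,
    "snappy-100mb": 32828,
    "protobuf-250mb": 31476,
    "snappy-250mb": 14497,
}


def missing(count, expected=EXPECTED):
    """Return (records missing, percentage missing) for one variation."""
    lost = expected - count
    return lost, 100.0 * lost / expected


if __name__ == "__main__":
    print(f"{'variation':16s} {'count':>6s} {'lost':>6s} {'lost %':>7s}")
    for name, count in counts.items():
        lost, pct = missing(count)
        print(f"{name:16s} {count:6d} {lost:6d} {pct:7.2f}")
```

Running it prints one row per variation, which makes it straightforward to see which object sizes and encodings lose records.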

Wesley Dawson [:whd]

Reporter

Comment 1

•

9 years ago

:rvitillo, can you take a look at this? I will continue to look at this next week, but you might be able to figure it out in a more timely fashion.

Flags: needinfo?(rvitillo)

Roberto Agostino Vitillo (:rvitillo)

Assignee

Comment 2

•

9 years ago

I can take this.

Flags: needinfo?(rvitillo)

Roberto Agostino Vitillo (:rvitillo)

Assignee

Updated

•

9 years ago

Assignee: nobody → rvitillo

Roberto Agostino Vitillo (:rvitillo)

Assignee

Updated

•

9 years ago

Points: --- → 2

Priority: -- → P2

Roberto Agostino Vitillo (:rvitillo)

Assignee

Updated

•

9 years ago

Blocks: 1255748

Anthony Zhang [:azhang] (last day at Mozilla: 2016-04-29)

Comment 3

•

9 years ago

I did a bit of digging around, and here are the initial results: https://gist.github.com/Uberi/a3a92bb011c7f3b0e8dc677c91471a10 It seems like telemetry.utils.heka_message.unpack can't backtrack properly in the Snappy files. For some reason, it works for Protobuf. I'll need to look at it some more to find the exact cause, but I suspect this is a bug in the Snappy library.

Anthony Zhang [:azhang] (last day at Mozilla: 2016-04-29)

Comment 4

•

9 years ago

Ref: https://github.com/mozilla/telemetry-tools/pull/4

Roberto Agostino Vitillo (:rvitillo)

Assignee

Comment 5

•

9 years ago

In the end we decided to remove Heka file chunking altogether, since it doesn't work correctly with Snappy encoding. Chunking was originally introduced to reduce memory pressure. Since then, some configuration changes have landed that deal with that issue in a more general way, so chunking should no longer be required.

https://github.com/mozilla/python_moztelemetry/pull/62
https://github.com/mozilla/telemetry-tools/pull/5
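The exact failure inside the chunked reader is not established in this bug. As a purely illustrative toy model, the sketch below shows one general hazard of splitting a framed binary stream at arbitrary offsets: a reader that resynchronises on a separator byte can latch onto a separator-valued byte inside a payload. The framing (separator, 1-byte length, payload) and the names `encode`, `parse`, and `parse_chunked` are hypothetical; this is neither the Heka format nor the moztelemetry code.

```python
SEP = b"\x1e"  # toy record separator byte


def encode(payloads):
    """Frame each payload as: separator, 1-byte length, payload."""
    return b"".join(SEP + bytes([len(p)]) + p for p in payloads)


def parse(buf, start=0, end=None):
    """Return payloads of records whose separator is found in [start, end)."""
    end = len(buf) if end is None else end
    records = []
    pos = start
    while pos < end:
        pos = buf.find(SEP, pos)  # resynchronise on the next separator byte
        if pos == -1 or pos >= end or pos + 1 >= len(buf):
            break
        length = buf[pos + 1]
        body = buf[pos + 2 : pos + 2 + length]
        if len(body) < length:  # truncated record at the end of the buffer
            break
        records.append(body)
        pos += 2 + length
    return records


def parse_chunked(buf, chunk_size):
    """Parse each fixed-size chunk independently, as a chunked reader might."""
    records = []
    for start in range(0, len(buf), chunk_size):
        records.extend(parse(buf, start, start + chunk_size))
    return records


# Text-like payloads never contain the separator byte.
TEXT = [b"alpha", b"bravo", b"charlie", b"delta"]
# Binary-like payloads (for example compressed data) may contain it.
BINARY = [b"AA\x1e\x08BB"] * 4

if __name__ == "__main__":
    for name, payloads in (("text", TEXT), ("binary", BINARY)):
        stream = encode(payloads)
        whole = parse(stream)
        chunked = parse_chunked(stream, 12)
        intact = sum(1 for r in chunked if r in payloads)
        print(name, "whole ok:", whole == payloads, "chunked intact:", intact)
```

In this toy, sequential parsing of the whole stream recovers every record in both cases. With 12-byte chunks, the text stream is still recovered in full, while in the binary stream one chunk begins on a separator-valued byte inside a payload, so one real record is replaced by a garbage one. Reading each object in a single pass avoids this class of problem, at the cost of holding more data in memory.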

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

BMO Automation

Updated

•

6 years ago

Product: Cloud Services → Cloud Services Graveyard

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
