Dataset API returning empty RDD

RESOLVED FIXED

Status

Product: Cloud Services
Component: Metrics: Pipeline
Priority: P1
Severity: blocker
Status: RESOLVED FIXED
a year ago
a year ago

People

(Reporter: frank, Assigned: wlach)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

a year ago
No matter what I put into get_pings, the result is now empty. All the mobile jobs failed because of this.
(Reporter)

Comment 1

a year ago
I've realized the Dataset API as a whole is broken:

>> Dataset.from_source('telemetry').where(docType = 'main').records(sc).count()
0
Severity: critical → blocker
Summary: get_pings returning empty RDD → Dataset API returning empty RDD
(Reporter)

Comment 2

a year ago
Just to be clear, the summaries() method is returning no files.
William, could you please verify if this is caused by the new version of python_moztelemetry that we released yesterday?
Flags: needinfo?(wlachance)
Frank, I can't seem to replicate this:

```
len(Dataset.from_source('telemetry').where(docType='main', submissionDate="20170401").summaries(sc)) 
143469
```
Frank, could you double check this is still broken for you per comment 4? Happy to look into this if so.
Flags: needinfo?(wlachance) → needinfo?(fbertsch)
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #4)
> Frank, I can't seem to replicate this:
> 
> ```
> len(Dataset.from_source('telemetry').where(docType='main',
> submissionDate="20170401").summaries(sc)) 
> 143469
> ```

Please ignore this; it turns out I ran that query on a cluster that was spawned before the new version of python_moztelemetry was released, so it was still running the previous version.
I've removed release 0.6.7 from PyPI while I investigate this.
Assignee: nobody → wlachance
Flags: needinfo?(fbertsch)
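For anyone retrying the reproduction: since the result depends on which release the cluster picked up when it was spawned, it may help to check the installed version first. The following is only a minimal sketch. It assumes Python 3.8+ (for `importlib.metadata`), that the distribution is installed under the name `python_moztelemetry`, and that 0.6.7 is the only affected release. The helper names `installed_version` and `is_affected` are made up for illustration and are not part of python_moztelemetry.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only the release pulled from PyPI in this bug is affected.
AFFECTED_RELEASES = {"0.6.7"}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def is_affected(release, affected=AFFECTED_RELEASES):
    """True if the given version string is one of the affected releases."""
    return release in affected


if __name__ == "__main__":
    # Hypothetical check to run on the cluster before retrying the query.
    found = installed_version("python_moztelemetry")
    if found is None:
        print("python_moztelemetry is not installed here")
    else:
        state = "affected" if is_affected(found) else "not in the affected set"
        print("python_moztelemetry %s (%s)" % (found, state))
```

A cluster that reports an older release would be expected to return results as in comment 4, so a successful query there says nothing about 0.6.7.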
Fixed with: https://github.com/mozilla/python_moztelemetry/pull/140
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED