Closed Bug 1353965 Opened 7 years ago Closed 7 years ago

Dataset API returning empty RDD

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: frank, Assigned: wlach)

No matter what I put into get_pings, the result is now empty. All the mobile jobs failed because of this.
I've realized the Dataset API as a whole is broken:

>>> Dataset.from_source('telemetry').where(docType='main').records(sc).count()
0
Severity: critical → blocker
Summary: get_pings returning empty RDD → Dataset API returning empty RDD
Just to be clear, the summaries() method is returning no files.
William, could you please verify if this is caused by the new version of python_moztelemetry that we released yesterday?
Flags: needinfo?(wlachance)
Frank, I can't seem to replicate this:

```
len(Dataset.from_source('telemetry').where(docType='main', submissionDate="20170401").summaries(sc)) 
143469
```
Frank, could you double check this is still broken for you per comment 4? Happy to look into this if so.
Flags: needinfo?(wlachance) → needinfo?(fbertsch)
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #4)
> Frank, I can't seem to replicate this:
> 
> ```
> len(Dataset.from_source('telemetry').where(docType='main',
> submissionDate="20170401").summaries(sc)) 
> 143469
> ```

Please ignore this; it turns out I ran that query on a cluster that was spawned before the new version of python_moztelemetry was released.
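Since whether the bug reproduces appears to depend on which python_moztelemetry release a cluster has installed, a quick version check before rerunning the query may save some confusion. The following is only a minimal sketch: the helper names `installed_version` and `is_affected` are hypothetical, it assumes Python 3.8+ (for `importlib.metadata`), and it assumes 0.6.7 is the affected release, which is what its removal from PyPI suggests.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_affected(dist_name="python_moztelemetry", bad_version="0.6.7"):
    """Return True only if the installed release matches the suspected bad one.

    Both defaults are assumptions taken from this bug thread, not from the library.
    """
    return installed_version(dist_name) == bad_version


if __name__ == "__main__":
    print("installed:", installed_version("python_moztelemetry"))
    print("suspected affected release:", is_affected())
```

Running this on the driver of each cluster would show whether a failed or successful reproduction was made against the same release.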
I've removed release 0.6.7 from pypi while I investigate this.
Assignee: nobody → wlachance
Flags: needinfo?(fbertsch)
Fixed with: https://github.com/mozilla/python_moztelemetry/pull/140
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard