Closed Bug 1541022 Opened 6 years ago Closed 6 years ago

Write glean pings to a temporary file and then move them to final location

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: travis_, Assigned: Dexter)

References

Details

(Whiteboard: [telemetry:mobilesdk:m7])

Attachments

(1 file)

Glean.sendPings causes some async business to happen which ultimately causes a PingUploadWorker to be enqueued in WorkManager. It may be possible that quick, successive calls to sendPings() could cause problems since it replaces the existing worker.

Whiteboard: [telemetry:mobilesdk:m?] → [telemetry:mobilesdk:m7]
Assignee: nobody → alessio.placitelli
Priority: P3 → P1

This doesn't seem to be a problem: our worker policy is now KEEP, so the worker doesn't get replaced or killed when calling sendPings, so having Synchronized there wouldn't help. However, the analysis helped uncover a few more issues:

(1) if we loose network connection, given that that's part of our worker constraints, the worker gets killed.

That's a documented behaviour of the WorkManager. I think that this can cause holes in our sequence numbers/data loss. Because:

This ends up deleting pings when network is down, while the worker is running. This is not highly likely to happen, but there's a chance that will happen.

(2) when writing pings, we might end up with partially written files to disk if the PingStorageEngine.process scans the directory while a write is happening.
As far as I can see, writing to disk writes to the final ping location, not to a temporary location and then moves the file there; no transactional-like behaviour.
There is a chance that the upload worker can see partial pings while scanning the directory, thus uploading corrupted data.

See Also: → 1543728

(In reply to Alessio Placitelli [:Dexter] from comment #1)

(2) when writing pings, we might end up with partially written files to disk if the PingStorageEngine.process scans the directory while a write is happening.
As far as I can see, writing to disk writes to the final ping location, not to a temporary location and then moves the file there; no transactional-like behaviour.
There is a chance that the upload worker can see partial pings while scanning the directory, thus uploading corrupted data.

Morphing this bug to handle the second case.

Summary: glean: investigate whether `Glean.sendPings()` should be `@Synchronized` → Write glean pings to a temporary file and then move them to final location
Attached file GitHub Pull Request
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: